Python | Perardua Consulting
https://www.perarduaconsulting.com/blog/tags/python

7 Easy Techniques to Detect Anomalies in Pandas for Data Analysis

Data analysis is an exciting journey, but it comes with its challenges. One of the biggest hurdles is identifying anomalies—unexpected results that can distort our conclusions and predictions.

Claude Paugh

May 14, 4 min read

20 views

Apache Iceberg and Pandas Analytics: Part II

As I indicated in Part I, I built some basic examples with PyIceberg and Python to learn more and to exercise some of the functionality it offers. I started with data that I collect from time to time for securities, mostly common stocks, along with various twelve-month key metrics and analyst forecasts. This extends my SEC filings collection, on which I have a running series of articles. I use this particular data to build out details for securities in my Neo4j g

Claude Paugh

May 9, 13 min read

334 views

Gathering Data Statistics Using PySpark: A Comparative Analysis with Scala

Data processing and statistics gathering are essential tasks in today's data-driven world. Engineers frequently find themselves choosing between tools like PySpark and Scala when embarking on these tasks.

Claude Paugh

Apr 15, 5 min read

11 views

Harnessing the Dask Python Library for Parallel Computing

Dask is a flexible library for parallel computing in Python. It is designed to scale from a single machine to a cluster of machines seamlessly. By using Dask, you can manage and manipulate large datasets that are too big to fit into memory on a single machine.

Claude Paugh

Apr 15, 5 min read

6 views
