Re: A handy tool called spark-column-analyser

2024-05-21 Thread ashok34...@yahoo.com.INVALID
Great work. Very handy for identifying problems, thanks. On Tuesday 21 May 2024 at 18:12:15 BST, Mich Talebzadeh wrote: A colleague kindly pointed out that an example of the output would be useful; it will be added to the README. Doing analysis for column Postcode Json formatted output { "Postcode":

Re: A handy tool called spark-column-analyser

2024-05-21 Thread Mich Talebzadeh
A colleague kindly pointed out that an example of the output would be useful; it will be added to the README. Doing analysis for column Postcode Json formatted output { "Postcode": { "exists": true, "num_rows": 93348, "data_type": "string", "null_count": 21921, "null_

A handy tool called spark-column-analyser

2024-05-21 Thread Mich Talebzadeh
I just wanted to share a tool I built called *spark-column-analyzer*. It's a Python package that helps you dig into your Spark DataFrames with ease. Ever spend ages figuring out what's going on in your columns? Like, how many null values are there, or how many unique entries? Built with data prepa
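
To make the kind of per-column summary discussed in this thread concrete, here is a minimal sketch in plain Python. It is not the package's code or API: the function name `analyze_column`, the `distinct_count` key, and the sample postcodes are all made up for illustration, and it works on a list of dicts rather than a Spark DataFrame. Only the keys `exists`, `num_rows`, `data_type` and `null_count` are taken from the example output quoted in the replies.

```python
import json
from typing import Any, Dict, List


def analyze_column(rows: List[Dict[str, Any]], column: str) -> Dict[str, Dict[str, Any]]:
    """Hypothetical stand-in: summarise one column of a list-of-dicts table."""
    # The column "exists" if at least one row carries that key.
    if not any(column in row for row in rows):
        return {column: {"exists": False}}

    # A missing key is treated the same as an explicit None (null).
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]

    return {
        column: {
            "exists": True,
            "num_rows": len(values),
            # Python type name of the first non-null value, not a Spark type.
            "data_type": type(non_null[0]).__name__ if non_null else None,
            "null_count": len(values) - len(non_null),
            # Assumed key name; counts distinct non-null values.
            "distinct_count": len(set(non_null)),
        }
    }


# Invented sample data, purely for illustration.
sample_rows = [
    {"Postcode": "AB1 2CD"},
    {"Postcode": None},
    {"Postcode": "AB1 2CD"},
    {"Postcode": "EF3 4GH"},
]

report = analyze_column(sample_rows, "Postcode")
print(json.dumps(report, indent=4))
```

On a real Spark DataFrame the same counts would come from DataFrame aggregations rather than Python lists; the sketch only shows the shape of the result.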