Hello experts,

I know that Gzip and Snappy files are not splittable, i.e. the file cannot
be divided among multiple tasks; instead, the whole file is loaded into a
single partition.
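
To show what I mean by "not splittable", here is a minimal sketch using only the Python standard library (no Spark involved). The sample data and the names try_decode and chunk_size are made up for illustration. It compares a single gzip stream over the whole data with data that is compressed chunk by chunk. As far as I understand, Parquet applies the codec to pieces inside the file rather than to the file as a whole, which would be closer to the second case, but that is the part I would like confirmed.

```python
import gzip
import zlib

# Hypothetical sample data standing in for a large file (220,000 bytes).
data = b"".join(b"row-%06d\n" % i for i in range(20000))


def try_decode(buf):
    """Return the decompressed bytes, or None if buf is not a valid gzip stream."""
    try:
        return gzip.decompress(buf)
    except (OSError, EOFError, zlib.error):
        return None


# Case 1: one gzip stream over the whole file, like a plain .gz file.
whole = gzip.compress(data)
# A reader that starts in the middle of the stream has no gzip header to work with.
second_half = try_decode(whole[len(whole) // 2:])

# Case 2: every chunk is compressed on its own, so each one is a complete stream.
chunk_size = 50_000  # illustrative value, not a Spark or Parquet setting
chunks = [
    gzip.compress(data[i:i + chunk_size])
    for i in range(0, len(data), chunk_size)
]
last_chunk = try_decode(chunks[-1])

print("whole stream readable from the middle:", second_half is not None)
print("last chunk readable on its own:", last_chunk is not None)
```

In the first case a reader has to start at byte 0, so only one task can process the file. In the second case each chunk can be handed to a different reader.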

So my question is this: when I write Parquet data via Spark, it gets stored
at the destination with names like *part*.snappy.parquet*

When I read this data back, will each file be loaded into a single partition
and hurt my read performance?

Please correct me if there is a gap in my understanding.

Thanks,
Sid
