Hello everyone,

I have two questions about the Parquet file format:
1. Where is the Parquet dictionary stored in a Parquet file? Is it stored
in the footer of the file, or in each page?
2. When Spark reads a Parquet file, how is the RDD partitioned? Does Spark
allocate one RDD partition per Parquet file, per page, per page group, or
per block?

I would appreciate it if anyone can help me with these questions.

Regards,
Mania
