Hello Iceberg devs!

Do any of you use *Parquet + Snappy* as the underlying file format?
Iceberg configures this by default as Parquet + gzip (
*write.parquet.compression-codec*).
*Is there any specific reason for this choice?*

In our preliminary tests we found better numbers with *Parquet + Snappy*
than with *gzip*.
Operation = compress and write to local disk
File size = 524.3 MB (about the same with both compression codecs)
Row group size = 64 MB

gzip     snappy
8.304    5.478
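
For anyone who wants to try a similar comparison, here is a minimal sketch of a compress-and-write timing harness. It is not the code behind the numbers above: it times raw byte-stream compression with Python's standard gzip module rather than Parquet encoding, the payload is synthetic, and the helper names (time_compress_and_write, run_benchmark) are made up for illustration. Snappy is included only if the optional third-party python-snappy package is installed.

```python
import gzip
import os
import tempfile
import time


def time_compress_and_write(codec_name, compress, payload, directory):
    """Compress payload with the given callable, write it to disk, and time both steps."""
    path = os.path.join(directory, f"sample.{codec_name}")
    start = time.perf_counter()
    compressed = compress(payload)
    with open(path, "wb") as f:
        f.write(compressed)
        f.flush()
        os.fsync(f.fileno())  # include the flush to local disk in the timing
    elapsed = time.perf_counter() - start
    return {"codec": codec_name, "seconds": elapsed, "bytes": os.path.getsize(path)}


def run_benchmark(payload, codecs):
    """Run every codec in the mapping against the same payload in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return [
            time_compress_and_write(name, fn, payload, tmp)
            for name, fn in codecs.items()
        ]


if __name__ == "__main__":
    # Synthetic, moderately compressible payload (an assumption, not the real test data).
    payload = b"".join(b"row-%d,value-%d\n" % (i, i % 97) for i in range(200_000))
    codecs = {"gzip": gzip.compress}
    try:
        import snappy  # optional: python-snappy, not part of the standard library
        codecs["snappy"] = snappy.compress
    except ImportError:
        pass
    for result in run_benchmark(payload, codecs):
        print(result)
```

Reporting both elapsed time and output size for each codec makes the time-versus-size trade-off visible in one run. A real Parquet benchmark would also need to account for row group size and column encodings, which this sketch ignores.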


We are still in the process of our full benchmarking (for reads), but we
want to understand whether there is a whole different angle to this that we
are not thinking through.

Truly appreciate any inputs,
Sreeram
