James, you are doing fine.
Would it be possible to publish a blog post about this on the website?

> On 29 Sep 2021, at 20:27, James Turton <dz...@apache.org> wrote:
> 
> Hi all
> 
> We've got support for reading and writing using additional Parquet 
> compression codecs in master now.  Here are the footprints of a 25M record 
> dataset compressed by Drill with different codecs.
> 
> | Codec  | Size on disk (MB) |
> | ------ | ----------------- |
> | brotli |   87              |
> | gzip   |   80              |
> | lz4    |  100.6            |
> | lzo    |  100.8            |
> | snappy |  192              |
> | zstd   |   85              |
> | none   | 2152              |
> 
> I haven't made measurements of (de)compression speed differences myself but 
> there are many such benchmarks around on the web, and the differences can be 
> big *if* you've got a workload that is CPU bound by (de)compression.  Beyond 
> that there are the usual considerations like better utilisation of the OS 
> page cache by the higher compression ratio codecs, less I/O when data must 
> come from disk, etc.  Zstd is probably the one I'll be putting into 
> `store.parquet.compression` myself at this point.
> 
> Happy Drilling!
> James

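For anyone who wants to try this out, here is a minimal sketch of setting the option James mentions for the current session and writing a table with it. It assumes a Drill build that includes the new codecs; the `dfs.tmp` workspace, the table name and the source path are placeholders for illustration, not something from James's test.

```sql
-- Write Parquet (the default output format) using Zstandard compression.
ALTER SESSION SET `store.format` = 'parquet';
ALTER SESSION SET `store.parquet.compression` = 'zstd';

-- Hypothetical names: replace the workspace, table and source path with your own.
CREATE TABLE dfs.tmp.`records_zstd` AS
SELECT * FROM dfs.`/path/to/source/data`;
```

Using `ALTER SYSTEM` instead of `ALTER SESSION` would make the codec the default for all sessions, so it is probably worth comparing sizes at session level first.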