Hi team, thanks to the very valuable reviews and feedback from Juliean,
Micah, Andrew and others, the FSST proposal is in the PoC stage, and will
be worked on in the coming weeks.

I just wanted to start a discussion on Composite encodings for Parquet and
get the community's thoughts, feedback and suggestions on nested encodings.

Nested/Composite/Hierarchical encodings are supported in Vortex, FastLanes,
etc., and partly supported in Parquet (with Dict + RLE). This
proposal discusses formalizing that support and paving the way for future
encodings like Dict + FSST, Delta + RLE and others.
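
To make the idea concrete, here is a toy sketch of a Dict + RLE composite:
the column is dictionary-encoded first, and the resulting indices are then
run-length encoded. The function names are made up for illustration, and
the (index, run length) pairs are a simplification, not Parquet's actual
RLE/bit-packing hybrid layout.

```python
from itertools import groupby


def dict_encode(values):
    """First layer: replace each value with an index into a dictionary."""
    dictionary = []   # distinct values, in order of first appearance
    index_of = {}     # value -> position in `dictionary`
    indices = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        indices.append(index_of[v])
    return dictionary, indices


def rle_encode(indices):
    """Second layer: collapse consecutive equal indices into (index, count)."""
    return [(idx, len(list(group))) for idx, group in groupby(indices)]


def composite_encode(values):
    """Dict + RLE: run-length encode the output of dictionary encoding."""
    dictionary, indices = dict_encode(values)
    return dictionary, rle_encode(indices)


def composite_decode(dictionary, runs):
    """Undo both layers: expand the runs, then look up each index."""
    return [dictionary[idx] for idx, count in runs for _ in range(count)]


column = ["a", "a", "a", "b", "b", "a"]
dictionary, runs = composite_encode(column)
restored = composite_decode(dictionary, runs)
```

The same layering pattern would apply to the other combinations mentioned
above, with the inner and outer steps swapped for the respective encodings.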

Several benchmarks were run on some well-recognized nested encodings, and
they show significant compression gains (on the order of 10x), which are
further detailed in the doc.

Would love to get your thoughts and feedback!
https://docs.google.com/document/d/1Yi5JwpKEsRFw7D8-iETguRDPtjlyiKITCguYUrrzEVY

Regards,
 - Arnav
