I think cascaded encodings would be a good idea in the long run. I am a little worried, though, that this proposal depends on encoding proposals that are still in flight, and it would be nice to focus on landing those before moving on to something more complex.
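
For concreteness, here is a minimal sketch of what a cascade such as the Delta + RLE combination mentioned in this thread might look like. It is plain Python for illustration only, it does not follow Parquet's actual DELTA_BINARY_PACKED or RLE layouts, and every function name in it is hypothetical.

```python
from itertools import groupby


def delta_encode(values):
    """Return the first value followed by successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out = []
    total = 0
    for i, d in enumerate(deltas):
        total = d if i == 0 else total + d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(v, sum(1 for _ in run)) for v, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into a flat list."""
    return [v for v, n in pairs for _ in range(n)]


def cascade_encode(values):
    """Hypothetical cascade: the delta output is fed into RLE."""
    return rle_encode(delta_encode(values))


def cascade_decode(pairs):
    """Undo the cascade in reverse order: RLE first, then delta."""
    return delta_decode(rle_decode(pairs))


if __name__ == "__main__":
    column = [100, 101, 102, 103, 104, 105]
    encoded = cascade_encode(column)
    print(encoded)
    print(cascade_decode(encoded) == column)
```

The point of the sketch is only that the second stage consumes the output of the first, so a monotonically increasing column collapses into a couple of runs. How the stages would be declared in the Parquet metadata is exactly what the proposal would need to define.
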

On Mon, Dec 8, 2025 at 11:31 PM Arnav Balyan <[email protected]> wrote:

> Hi Antoine,
> Thanks for the review, I'll add this data shortly.
>
> On Mon, Dec 8, 2025 at 4:18 PM Antoine Pitrou <[email protected]> wrote:
>
> > Hello Arnav,
> >
> > Was any additional compression applied? I could not find any
> > information in the document.
> >
> > Ideally, for numerical columns I think the following configurations
> > should be compared:
> >
> > - PLAIN
> > - PLAIN + ZSTD
> > - BYTE_STREAM_SPLIT + ZSTD
> > - DELTA + RLE
> > - DELTA + ZSTD
> >
> > For strings you might want to compare the following:
> >
> > - PLAIN
> > - PLAIN + ZSTD
> > - DELTA_BYTE_ARRAY
> > - DELTA_BYTE_ARRAY + ZSTD
> > - DICT
> > - DICT + FSST
> > - DICT + ZSTD
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Mon, 8 Dec 2025 15:14:20 +0530
> > Arnav Balyan <[email protected]>
> > wrote:
> > > Hi team, thanks to the very valuable reviews and feedback from Juliean,
> > > Micah, Adnrew and others, the FSST proposal is in the PoC stage, and will
> > > be worked upon in the coming weeks.
> > >
> > > I just wanted to start a discussion on Composite encodings for Parquet and
> > > get the community's thoughts, feedback and suggestions on nested encodings.
> > >
> > > Nested/Composite/Hierarchical encodings are supported in Vortex, Fastlanes
> > > etc, and partly supported in Parquet (with Dict + RLE). This
> > > proposal discusses formalizing the same and paving way for future encodings
> > > like Dict + FSST, Delta + RLE and others.
> > >
> > > Several benchmarks were run on some well recognized nested encodings, and
> > > show significant compression gains (order of 10x improvements) which are
> > > further detailed in the doc.
> > >
> > > Would love to get your thoughts and feedback!
> > >
> > > https://docs.google.com/document/d/1Yi5JwpKEsRFw7D8-iETguRDPtjlyiKITCguYUrrzEVY
> > >
> > > Regards,
> > > - Arnav
