I think cascaded encodings would be a good idea in the long run.  I worry a
little that they depend on in-flight encoding proposals, and it would be
nice to focus on landing those before moving to something more complex.

On Mon, Dec 8, 2025 at 11:31 PM Arnav Balyan <[email protected]> wrote:

> Hi Antoine,
> Thanks for the review, I'll add this data shortly.
>
> On Mon, Dec 8, 2025 at 4:18 PM Antoine Pitrou <[email protected]> wrote:
>
> >
> > Hello Arnav,
> >
> > Was any additional compression applied? I could not find any
> > information in the document.
> >
> > Ideally, for numerical columns I think the following configurations
> > should be compared:
> >
> > - PLAIN
> > - PLAIN + ZSTD
> > - BYTE_STREAM_SPLIT + ZSTD
> > - DELTA + RLE
> > - DELTA + ZSTD
> >
> > For strings you might want to compare the following:
> >
> > - PLAIN
> > - PLAIN + ZSTD
> > - DELTA_BYTE_ARRAY
> > - DELTA_BYTE_ARRAY + ZSTD
> > - DICT
> > - DICT + FSST
> > - DICT + ZSTD
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Mon, 8 Dec 2025 15:14:20 +0530
> > Arnav Balyan <[email protected]>
> > wrote:
> > > Hi team, thanks to the very valuable reviews and feedback from Juliean,
> > > Micah, Andrew and others, the FSST proposal is in the PoC stage, and will
> > > be worked on in the coming weeks.
> > >
> > > I just wanted to start a discussion on Composite encodings for Parquet and
> > > get the community's thoughts, feedback and suggestions on nested encodings.
> > >
> > > Nested/Composite/Hierarchical encodings are supported in Vortex, Fastlanes
> > > etc., and partly supported in Parquet (with Dict + RLE). This proposal
> > > discusses formalizing the same and paving the way for future encodings
> > > like Dict + FSST, Delta + RLE and others.
> > >
> > > Several benchmarks were run on some well-recognized nested encodings, and
> > > show significant compression gains (on the order of 10x improvements),
> > > which are further detailed in the doc.
> > >
> > > Would love to get your thoughts and feedback!
> > >
> > > https://docs.google.com/document/d/1Yi5JwpKEsRFw7D8-iETguRDPtjlyiKITCguYUrrzEVY
> > >
> > > Regards,
> > >  - Arnav
> > >
> >
> >
> >
> >
>
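
For readers following the thread, the idea of a cascaded (nested) encoding
such as Delta + RLE can be sketched in a few lines of Python. This is only a
toy illustration under stated assumptions: the function names are
hypothetical, the data is made up, and it does not reproduce Parquet's actual
DELTA_BINARY_PACKED or RLE/bit-packing hybrid layouts, nor anything from the
linked proposal.

```python
from itertools import groupby


def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a flat list."""
    return [v for v, n in runs for _ in range(n)]


def cascade_encode(values):
    """Hypothetical cascade: the output of delta is the input of RLE."""
    return rle_encode(delta_encode(values))


def cascade_decode(runs):
    """Undo the cascade in reverse order: RLE first, then delta."""
    return delta_decode(rle_decode(runs))


# Made-up example: evenly spaced values, as in a timestamp-like column.
timestamps = list(range(1000, 1100, 10))
encoded = cascade_encode(timestamps)
assert cascade_decode(encoded) == timestamps
```

On evenly spaced input the delta stage turns the column into one start value
followed by a constant, which the RLE stage then collapses into two runs.
Neither stage alone achieves that on this data, which is the motivation for
chaining them.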
