Hey Julien,

Yes, completely agreed: we are past the DISCUSS phase and into the DRAFT/POC phase. I’ll raise the PR shortly to add the FSST proposal to the list.

Warm regards,
Arnav

On Thu, Nov 20, 2025 at 9:47 PM Julien Le Dem wrote:

> @Arnav Balyan: same question as on the FSST thread: Would you agree that we are past the DISCUSS step and into the DRAFT/POC phase according to the proposals process <https://github.com/apache/parquet-format/tree/master/proposals>? If yes, could you open a PR on this page to add this proposal to the list?
> https://github.com/apache/parquet-format/tree/master/proposals
> Thank you!
>
> On Thu, Oct 30, 2025 at 6:55 PM Arnav Balyan wrote:
>
> > Hi Andrew,
> > Thanks for filing the ticket, really appreciate it.
> > I’ll check out the Rust side of things and get the performance numbers. I would expect it to do better in Rust, but will share once the results are ready.
> >
> > Thanks.
> > Arnav
> >
> > On Fri, Oct 31, 2025 at 3:07 AM Andrew Lamb wrote:
> >
> > > I meant to point at the FSST ticket:
> > > https://github.com/apache/arrow-rs/issues/8749 (I am already getting confused)
> > >
> > > On Thu, Oct 30, 2025 at 5:36 PM Andrew Lamb wrote:
> > >
> > > > Thanks again for writing this up Arnav -- it is greatly appreciated
> > > >
> > > > I have filed a ticket[1] in arrow-rs to track prototyping ALP in the Rust Parquet reader if anyone is interested
> > > >
> > > > Andrew
> > > >
> > > > [1]: https://github.com/apache/arrow-rs/issues/8748
> > > >
> > > > On Wed, Oct 29, 2025 at 1:41 AM Arnav Balyan wrote:
> > > >
> > > >> Hi team,
> > > >>
> > > >> Just wanted to start a discussion for FSST integration in Parquet. For quick context, FSST (Fast Static Symbol Table) enables high compression ratios for unstructured textual data. It's used by other systems like DuckDB and MonetDB offering up to 3.3x compression ratios with minimal read/write overheads.
> > > >>
> > > >> We integrated FSST for Parquet and did benchmarks on Parquet, attached here is a doc with our findings and results.
> > > >>
> > > >> https://docs.google.com/document/d/1g7zgopxeHc5nofJXfc8EEp_HGMaI8g-jFVvNCs2GVA0/edit?tab=t.0#heading=h.2eyxl5kkyzy7
> > > >>
> > > >> Would love to know your suggestions, feedback and thoughts.
> > > >> Regards,
> > > >>
> > > >> - Arnav
