Hi Andrew,
Thanks for filing the ticket, really appreciate it.
I’ll check out the Rust side of things and get the performance numbers. I
would expect it to do better in Rust, but will share once the results are
ready.


Thanks.
Arnav

On Fri, Oct 31, 2025 at 3:07 AM Andrew Lamb <[email protected]> wrote:

> I meant to point at the FSST ticket:
> https://github.com/apache/arrow-rs/issues/8749 (I am already getting
> confused)
>
> On Thu, Oct 30, 2025 at 5:36 PM Andrew Lamb <[email protected]>
> wrote:
>
> > Thanks again for writing this up Arnav -- it is greatly appreciated
> >
> > I have filed a ticket[1] in arrow-rs to track prototyping ALP in the Rust
> > Parquet reader if anyone is interested
> >
> > Andrew
> >
> > [1]:  https://github.com/apache/arrow-rs/issues/8748
> >
> > On Wed, Oct 29, 2025 at 1:41 AM Arnav Balyan <[email protected]>
> > wrote:
> >
> >> Hi team,
> >>
> >> Just wanted to start a discussion for FSST integration in Parquet. For
> >> quick context, FSST (Fast Static Symbol Table) enables high compression
> >> ratios for unstructured textual data. It's used by other systems like
> >> DuckDB and MonetDB, offering up to 3.3x compression ratios with minimal
> >> read/write overhead.
> >>
> >> We integrated FSST into Parquet and ran benchmarks; attached here is a
> >> doc with our findings and results.
> >>
> >>
> >> https://docs.google.com/document/d/1g7zgopxeHc5nofJXfc8EEp_HGMaI8g-jFVvNCs2GVA0/edit?tab=t.0#heading=h.2eyxl5kkyzy7
> >>
> >> Would love to know your suggestions, feedback and thoughts.
> >> Regards,
> >>
> >>  - Arnav
> >>
> >
>
