Thanks again for writing this up, Arnav -- it is greatly appreciated.

I have filed a ticket [1] in arrow-rs to track prototyping ALP in the Rust
Parquet reader, if anyone is interested.

Andrew

[1]:  https://github.com/apache/arrow-rs/issues/8748

On Wed, Oct 29, 2025 at 1:41 AM Arnav Balyan <[email protected]> wrote:

> Hi team,
>
> Just wanted to start a discussion for FSST integration in Parquet. For
> quick context, FSST (Fast Static Symbol Table) enables high compression
> ratios for unstructured textual data. It's used by other systems like
> DuckDB and MonetDB, offering up to 3.3x compression ratios with minimal
> read/write overheads.
>
> We integrated FSST into Parquet and ran benchmarks. Attached here is a
> doc with our findings and results.
>
> https://docs.google.com/document/d/1g7zgopxeHc5nofJXfc8EEp_HGMaI8g-jFVvNCs2GVA0/edit?tab=t.0#heading=h.2eyxl5kkyzy7
>
> Would love to know your suggestions, feedback and thoughts.
> Regards,
>
>  - Arnav
>
