Hi team,

I'd like to start a discussion on FSST integration in Parquet. For
quick context, FSST (Fast Static Symbol Table) enables high compression
ratios for unstructured textual data. It's used by other systems like
DuckDB and MonetDB, offering up to 3.3x compression ratios with minimal
read/write overhead.

We integrated FSST into Parquet and benchmarked it. The doc linked here
describes our findings and results:
https://docs.google.com/document/d/1g7zgopxeHc5nofJXfc8EEp_HGMaI8g-jFVvNCs2GVA0/edit?tab=t.0#heading=h.2eyxl5kkyzy7

We would love to hear your suggestions, feedback, and thoughts.

Regards,
Arnav
