I think Ashish's question was about how to determine the right configuration in the first place. IIUC, parquet-rewrite requires the user to pass these settings in.
I'm not aware of any tool to choose good Parquet configurations automatically. I sometimes use the parquet-tools pip package / CLI to inspect Parquet files and see how they are configured, but I've only tuned manually.

On Tue, May 27, 2025, 16:22 Andrew Lamb <[email protected]> wrote:

> We have one in the arrow-rs repository: parquet-rewrite[1]
>
> [1]:
> https://github.com/apache/arrow-rs/blob/0da003becbd6489f483b70e5914a242edd8c6d1a/parquet/src/bin/parquet-rewrite.rs#L18
>
> On Tue, May 27, 2025 at 12:41 PM Ashish Singh <[email protected]> wrote:
>
> > Hey all,
> >
> > Is there any tool/ lib folks use to tune parquet configs to optimize for
> > storage size / read/ write speed?
> >
> > - Ashish
