I think Ashish's question was about determining the right configuration in
the first place. IIUC, parquet-rewrite requires the user to pass the
settings in.

I'm not aware of any tool that chooses good Parquet configurations
automatically. I sometimes use the parquet-tools pip package / CLI to
inspect Parquet files and see how they are configured, but I've only ever
tuned the settings manually.
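
As a rough illustration of the kind of inspection described above, here is
a minimal sketch that reads the footer metadata with pyarrow (an assumption:
this is not parquet-tools and not something mentioned elsewhere in this
thread). The function names and the example.parquet path are hypothetical.
It collects, per column, the compression codecs and encodings in use and
the compressed vs. uncompressed sizes, which is usually enough to judge
whether a different setting is worth trying by hand.

```python
from collections import defaultdict


def summarize_columns(metadata):
    """Aggregate per-column settings and sizes across all row groups.

    `metadata` is expected to expose the same attributes as pyarrow's
    FileMetaData (num_row_groups, row_group(i), and so on).
    """
    summary = defaultdict(
        lambda: {"codecs": set(), "encodings": set(),
                 "compressed": 0, "uncompressed": 0}
    )
    for rg_index in range(metadata.num_row_groups):
        row_group = metadata.row_group(rg_index)
        for col_index in range(row_group.num_columns):
            chunk = row_group.column(col_index)
            entry = summary[chunk.path_in_schema]
            entry["codecs"].add(chunk.compression)
            entry["encodings"].update(chunk.encodings)
            entry["compressed"] += chunk.total_compressed_size
            entry["uncompressed"] += chunk.total_uncompressed_size
    return dict(summary)


def inspect_file(path):
    """Print a short per-column report for the Parquet file at `path`."""
    # Imported here so summarize_columns stays usable without pyarrow.
    import pyarrow.parquet as pq

    metadata = pq.ParquetFile(path).metadata
    print(f"rows={metadata.num_rows} row_groups={metadata.num_row_groups}")
    for name, entry in summarize_columns(metadata).items():
        # Guard against a zero compressed size to avoid dividing by zero.
        ratio = entry["uncompressed"] / max(entry["compressed"], 1)
        print(
            f"{name}: codecs={sorted(entry['codecs'])} "
            f"encodings={sorted(entry['encodings'])} "
            f"compression_ratio={ratio:.2f}"
        )


if __name__ == "__main__":
    inspect_file("example.parquet")  # hypothetical path
```

This only reports how an existing file was written; it does not suggest
settings. Choosing them still means rewriting the file with different
options and comparing the results.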

On Tue, May 27, 2025, 16:22 Andrew Lamb <[email protected]> wrote:

> We have one in the arrow-rs repository: parquet-rewrite[1]
>
>
>
> [1]:
>
> https://github.com/apache/arrow-rs/blob/0da003becbd6489f483b70e5914a242edd8c6d1a/parquet/src/bin/parquet-rewrite.rs#L18
>
> On Tue, May 27, 2025 at 12:41 PM Ashish Singh <[email protected]> wrote:
>
> > Hey all,
> >
> > Is there any tool/lib folks use to tune Parquet configs to optimize for
> > storage size or read/write speed?
> >
> > - Ashish
> >
>
