For normalization, I agree with Ryan. I was part of those other discussions, and I think this is an engine concern, not a storage one.
I'm also ok with basically getting no value from min/max of non-shredded fields.

On Wed, Dec 11, 2024 at 4:35 AM Antoine Pitrou <[email protected]> wrote:

> On Mon, 9 Dec 2024 16:33:51 -0800
> "[email protected]" <[email protected]> wrote:
> > I think that Parquet should exactly reproduce the data that is written to
> > files, rather than either allowing or requiring Parquet implementations to
> > normalize types. To me, that's a fundamental guarantee of the storage
> > layer. The compute layer can decide to normalize types and take actions to
> > make storage more efficient, but storage should not modify the data that is
> > passed to it.
>
> FWIW, I agree with this.
>
> Regards
>
> Antoine.
