> So really "closing out 2.0" in my mind mostly makes any existing
> distinction between V1 and V2 disappear for downstream consumers of Parquet.
I personally think that no longer discussing "V2" vs. "V1" would improve the
overall understanding of the state of Parquet implementations.

On Mon, Dec 8, 2025 at 3:46 AM Micah Kornfield <[email protected]> wrote:

> Hi Antoine,
>
> > The parquet-format source tree is already versioned, do we really need
> > something else?
>
> At this point, I'm hoping not. But there have been prior attempts to define
> what V2 is (or at least core features [1]). I think two things have
> happened over the course of time:
>
> 1. We've de-emphasized versioning in general and are now trying to document
>    feature support explicitly [2]
> 2. Over the past few years most OSS implementations we know about actually
>    support most of the initial novelties introduced as part of the V2 effort.
>
> > I agree with the changes you propose, but I'd rather we refrain from
> > branding it as "V2".
>
> Agreed, I don't want to brand this as anything or make a big deal about it.
> I think the proposed changes try to de-emphasize Parquet V2/2.0. Please let
> me know if there are other places where you think we can improve this.
>
> So really "closing out 2.0" in my mind mostly makes any existing
> distinction between V1 and V2 disappear for downstream consumers of
> Parquet.
>
> Cheers,
> Micah
>
> [1] https://github.com/apache/parquet-format/pull/164
> [2] https://parquet.apache.org/docs/file-format/implementationstatus/
>
> On Mon, Dec 8, 2025 at 12:16 AM Antoine Pitrou <[email protected]> wrote:
>
> > The parquet-format source tree is already versioned, do we really need
> > something else?
> >
> > I agree with the changes you propose, but I'd rather we refrain from
> > branding it as "V2".
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Fri, 5 Dec 2025 14:55:36 -0800
> > Micah Kornfield <[email protected]>
> > wrote:
> > > There still appears to be a recurring question for what exactly
> > > constitutes Parquet 2.0.
> > >
> > > Given current implementation statuses, my suggestion is to not mention
> > > 2.0 in general. I've proposed changes
> > > <https://github.com/apache/parquet-format/pull/535> [1] to this effect
> > > in the parquet-format repo to try to give guidance that:
> > >
> > > 1. All encodings documented can now be used regardless of page type.
> > > 2. DataPageHeaderV2 is now widely supported by readers
> > > 3. Versions should be populated with "1", but readers should accept "1"
> > >    and "2".
> > >
> > > Thoughts? Does this seem like a reasonable path forward?
> > >
> > > Thanks,
> > > Micah
> > >
> > >
> > > [1] https://github.com/apache/parquet-format/pull/535
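For readers skimming the thread, point 3 of the proposed guidance (write "1", accept "1" and "2") can be sketched roughly as below. This is only an illustration under assumptions: it takes "version" to mean the integer `version` field of the file footer metadata, and the names `WRITER_VERSION`, `ACCEPTED_VERSIONS`, and `check_footer_version` are hypothetical, not taken from any Parquet implementation or from the linked PR.

```python
# Minimal sketch of the "write 1, accept 1 and 2" rule from the thread.
# Assumption: "version" refers to the integer version field in the file footer.
# All names below are hypothetical and exist only for illustration.

WRITER_VERSION = 1                      # what a writer would populate
ACCEPTED_VERSIONS = frozenset({1, 2})   # what a reader would tolerate


def check_footer_version(version: int) -> int:
    """Return the version if a reader following the guidance should accept it."""
    if version not in ACCEPTED_VERSIONS:
        raise ValueError(f"unsupported Parquet footer version: {version}")
    return version


if __name__ == "__main__":
    # A writer following the guidance always emits 1; a reader accepts 1 or 2.
    for candidate in (WRITER_VERSION, 2):
        print("accepted version", check_footer_version(candidate))
```

The point of the sketch is that the version number stops carrying feature information: a reader would decide what it can decode from the encodings and page headers it actually encounters, not from this field.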
