> So really "closing out 2.0" in my mind mostly makes any existing
> distinction between V1 and V2 disappear for downstream consumers of
> Parquet.

I personally think that dropping the "V2" vs "V1" discussion would improve
the overall understanding of the state of Parquet implementations.



On Mon, Dec 8, 2025 at 3:46 AM Micah Kornfield <[email protected]>
wrote:

> Hi Antoine,
>
> > The parquet-format source tree is already versioned, do we really need
> > something else?
>
>
> At this point, I'm hoping not. But there have been prior attempts to define
> what V2 is (or at least core features [1]). I think two things have
> happened over the course of time:
>
> 1. We've de-emphasized versioning in general and are now trying to document
> feature support explicitly [2].
> 2. Over the past few years most OSS implementations we know about actually
> support most of the initial novelties introduced as part of the V2 effort.
>
>
> > I agree with the changes you propose, but I'd rather we refrain from
> > branding it as "V2".
>
>
> Agreed, I don't want to brand this as anything or make a big deal about it.
> I think the proposed changes try to de-emphasize Parquet V2/2.0. Please let
> me know if there are other places where you think we can improve this.
>
> So really "closing out 2.0" in my mind mostly makes any existing
> distinction between V1 and V2 disappear for downstream consumers of
> Parquet.
>
> Cheers,
> Micah
>
> [1] https://github.com/apache/parquet-format/pull/164
> [2] https://parquet.apache.org/docs/file-format/implementationstatus/
>
> On Mon, Dec 8, 2025 at 12:16 AM Antoine Pitrou <[email protected]> wrote:
>
> >
> > The parquet-format source tree is already versioned, do we really need
> > something else?
> >
> > I agree with the changes you propose, but I'd rather we refrain from
> > branding it as "V2".
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Fri, 5 Dec 2025 14:55:36 -0800
> > Micah Kornfield <[email protected]>
> > wrote:
> > > > There still appears to be a recurring question about what exactly
> > > > constitutes Parquet 2.0.
> > >
> > > > Given current implementation statuses, my suggestion is to not mention
> > > > 2.0 in general. I've proposed changes
> > > > <https://github.com/apache/parquet-format/pull/535> [1] to this effect
> > > > in the parquet-format repo to try to give guidance that:
> > >
> > > > 1.  All documented encodings can now be used regardless of page type.
> > > > 2.  DataPageHeaderV2 is now widely supported by readers.
> > > > 3.  Versions should be populated with "1", but readers should accept
> > > > "1" and "2".
> > >
> > > Thoughts?  Does this seem like a reasonable path forward?
> > >
> > > Thanks,
> > > Micah
> > >
> > >
> > > [1] https://github.com/apache/parquet-format/pull/535
> > >
> >
> >
> >
> >
>
