Thanks, Micah, for taking the lead here.

I agree that V2 is a bit meaningless on its own. I think it would be more
valuable to establish a baseline of what's expected to be supported today,
and then we can build from there.

Kind regards,
Fokko

On Mon, Dec 8, 2025 at 12:30 PM Andrew Lamb <[email protected]> wrote:

> > So really "closing out 2.0" in my mind mostly makes any existing
> > distinction between V1 and V2 disappear for downstream consumers of
> > Parquet.
>
> I personally think stopping discussing "V2" vs "V1" would improve the
> overall understanding of the state of Parquet implementations.
>
>
>
> On Mon, Dec 8, 2025 at 3:46 AM Micah Kornfield <[email protected]>
> wrote:
>
> > Hi Antoine,
> >
> > > The parquet-format source tree is already versioned, do we really need
> > > something else?
> >
> >
> > At this point, I'm hoping not. But there have been prior attempts to define
> > what V2 is (or at least core features [1]). I think two things have
> > happened over the course of time:
> >
> > 1. We've de-emphasized versioning in general and are now trying to document
> > feature support explicitly [2].
> > 2. Over the past few years most OSS implementations we know about actually
> > support most of the initial novelties introduced as part of the V2 effort.
> >
> >
> > > I agree with the changes you propose, but I'd rather we refrain from
> > > branding it as "V2".
> >
> >
> > Agreed, I don't want to brand this as anything or make a big deal about it.
> > I think the proposed changes try to de-emphasize Parquet V2/2.0. Please let
> > me know if there are other places where you think we can improve this.
> >
> > So really "closing out 2.0" in my mind mostly makes any existing
> > distinction between V1 and V2 disappear for downstream consumers of
> > Parquet.
> >
> > Cheers,
> > Micah
> >
> > [1] https://github.com/apache/parquet-format/pull/164
> > [2] https://parquet.apache.org/docs/file-format/implementationstatus/
> >
> > On Mon, Dec 8, 2025 at 12:16 AM Antoine Pitrou <[email protected]> wrote:
> >
> > >
> > > The parquet-format source tree is already versioned, do we really need
> > > something else?
> > >
> > > I agree with the changes you propose, but I'd rather we refrain from
> > > branding it as "V2".
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > On Fri, 5 Dec 2025 14:55:36 -0800
> > > Micah Kornfield <[email protected]>
> > > wrote:
> > > > There still appears to be a recurring question of what exactly constitutes
> > > > Parquet 2.0.
> > > >
> > > > Given current implementation statuses, my suggestion is to not mention 2.0
> > > > in general. I've proposed changes
> > > > <https://github.com/apache/parquet-format/pull/535> [1] to this effect in the
> > > > parquet-format repo to try to give guidance that:
> > > >
> > > > 1.  All encodings documented can now be used regardless of page type.
> > > > 2.  DataPageHeaderV2 is now widely supported by readers
> > > > 3.  Versions should be populated with "1", but readers should accept "1"
> > > > and "2".
> > > >
> > > > Thoughts?  Does this seem like a reasonable path forward?
> > > >
> > > > Thanks,
> > > > Micah
> > > >
> > > >
> > > > [1] https://github.com/apache/parquet-format/pull/535
> > > >
> > >
> > >
> > >
> > >
> >
>
