I'm in favor of the curated bundle approach. A headline version
that groups several features together is much easier to reason
about than a time-based snapshot of current implementation support.

To me, a curated version draws a line in the sand: "these are the
features that go together, here's what to target." A calendar preset
looks post facto at the ecosystem and labels the result. Those could
end up bundling the same things, but in the preset model the date
itself is just another arbitrary gating mechanism, dressed up as a
non-decision. Instead of trying to get things in to be V3-compatible,
we'd be trying to get things done by a specific date to be
2026-08-compatible, which feels like the same pressure to me, just
relocated.

Of the two, I'd rather have an implementation feel the impetus to
update and be compatible with a published bundle than have maintainers
who've already shipped a feature feel the impetus to push it into other
implementations so it lands in the next preset.

I do take Antoine's point that we've struggled to commit to periodic
version cuts in the past. I'd rather we fix that than codify the
inability as the design. A published "V3 contains these features" also
gives implementations something concrete to aim at before the date
arrives, which a calendar snapshot can't.

I'm really glad to see this coming up for discussion, though, since I
think it's critical that we make progress at the speed required to keep
up with other new formats.

On Tue, Jun 9, 2026 at 9:34 AM Antoine Pitrou <[email protected]> wrote:

> On 09/06/2026 at 15:18, Andrew Lamb wrote:
> > While working to document what features are forwards incompatible and what
> > parquet-format version they were introduced in[1], it occurs to me that we
> > **already have** a versioning scheme that is frequently released, time
> > based and clearly defines feature levels:
> >
> > parquet-format version (e.g. 2.11, 2.12, etc).  <---- We already have this!
>
> Well, I don't understand how 2.11 is "time-based". The parquet-format
> repo doesn't even have a periodic release schedule.
>
> > The only missing piece is that parquet-format version is not recorded in
> > the metadata itself.
>
> Aren't we moving the goalposts here?
>
> IIRC the basis for this discussion was to inform Parquet *writers* about
> which features can safely be enabled. Recording the format version in a
> Parquet file's metadata does not help achieve that.
>
> And why would a Parquet reader bother checking the version? Usually,
> there is no format version that is an exact match for a Parquet reader
> implementation's feature set.
>
> > ps. If you squint, I think the parquet-format versions look a lot like a
> > combination of "preset" and versions, which is also a nice property.
>
> You really have to squint *a lot* for that to work...
>
> Regards
>
> Antoine.
