Ryan and Dan made a great point on the call the other day that there are
two categories of new features:
- backwards **compatible**: old readers can still read files (e.g.
PageIndex, new logical types)
- backwards **incompatible**: old readers can no longer read the files
(e.g. new encodings, proposed path_in_schema removal, ...)
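
As a rough illustration of the distinction, here is a minimal sketch. The feature labels and the `can_read` helper are made up for this email; they are not identifiers from the Parquet spec or from any Parquet library.

```python
# Hypothetical feature labels, invented for illustration only.
# Features an old reader can safely ignore and still decode the data.
COMPATIBLE = {"page_index", "new_logical_type"}
# Features a reader must understand in order to decode the data.
INCOMPATIBLE = {"new_encoding", "path_in_schema_removal"}


def can_read(reader_supports, file_uses):
    """Return True if the reader understands every incompatible feature the file uses."""
    required = set(file_uses) & INCOMPATIBLE
    return required <= set(reader_supports)


old_reader = set()  # an old reader that knows none of the newer features
print(can_read(old_reader, {"page_index"}))    # True
print(can_read(old_reader, {"new_encoding"}))  # False
```

Only the incompatible set matters for the decision: a compatible feature never appears in `required`, so it can never block an old reader.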

The new features / changes we have recently approved and added to the spec
are mostly **backwards compatible** (e.g. Variant) and thus didn't need
ecosystem-wide coordination.

I think there is more friction around new incompatible changes (older readers
will not be able to read files written with these features).

I agree with Dan, Ryan and others that unless we define some signal in the
file itself (e.g. version 3 😬) it will be close to impossible for users to
understand which features are compatible with other systems.
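
For illustration, the reader-side check that such a signal would enable could look like the sketch below. It assumes a hypothetical integer version stored in the file; `UnsupportedVersionError` and `check_readable` are invented names, and this is not how any existing reader behaves.

```python
class UnsupportedVersionError(Exception):
    """Raised when a file declares a version newer than the reader supports."""


def check_readable(file_version, max_supported_version):
    """Fail early with a clear message instead of failing deep inside decoding."""
    if file_version > max_supported_version:
        raise UnsupportedVersionError(
            f"file declares version {file_version}, "
            f"but this reader supports up to version {max_supported_version}"
        )


check_readable(2, max_supported_version=2)  # same version: passes silently
try:
    check_readable(3, max_supported_version=2)
except UnsupportedVersionError as err:
    print(err)
```

The point of the sketch is that a single declared number lets a user (or a tool) answer "can system X read this file?" without knowing the individual features involved.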

To help this process along, I made a PR [1] that documents more clearly
which features belong to version 1 and which to version 2. I also drafted
an example of what "V3" could look like [2].

Andrew

[1]: https://github.com/apache/parquet-site/pull/186
[2]: https://github.com/alamb/parquet-site/pull/1

On Fri, Jun 5, 2026 at 8:39 AM Antoine Pitrou <[email protected]> wrote:

>
> The purpose of the presets proposal is not to inform readers but to help
> users make a decision about which features to enable when writing a
> Parquet file.
>
> For example, a user of PyArrow could, instead of passing an elaborate
> set of flags, call `pq.write_table(tab, 'file.pq', preset='2024-01')`.
>
> Regards
>
> Antoine.
>
>
> Le 05/06/2026 à 00:01, Andrew Bell a écrit :
> > How can a reader know that it has the tooling to read a file with this
> > approach? What is the hesitation to change version numbers?
> >
> > --
> >
> > Andrew Bell
> > [email protected]
> >
> > On Thu, Jun 4, 2026, 4:37 PM Ed Seidl <[email protected]> wrote:
> >
> >> On 2026/06/04 20:17:45 Ryan Blue wrote:
> >>> What's a preset? Could you describe the idea in this discussion so we can
> >>> keep it in one place?
> >>>
> >>
> >> The concept was introduced earlier in this thread by Antoine.
> >> https://lists.apache.org/thread/gvw48wrkhgl83jljhd1hzb668ys9zvqx
> >>
> >> Cheers,
> >> Ed
> >>
> >
>
>
>
