Ryan and Dan made a great point on the call the other day that there are two categories of new features:

- backwards **compatible**: old readers can still read the files (e.g. PageIndex, new logical types)
- backwards **incompatible**: old readers can no longer read the files (e.g. new encodings, proposed path_in_schema removal, ...)

The new features / changes we have recently approved and added to the spec are mostly **backwards compatible** (e.g. Variant) and thus didn't need ecosystem-wide coordination.

I think there is more friction on new incompatible changes (older readers will not be able to read files written with these features).

I agree with Dan, Ryan and others that unless we define some signal in the file itself (e.g. version 3 😬) it will be close to impossible for users to understand which features are compatible with other systems.

To help this process along, I made a PR to document more clearly which features are in version 1 / version 2 [1]. I also drafted an example of what "V3" could look like [2].

Andrew

[1]: https://github.com/apache/parquet-site/pull/186
[2]: https://github.com/alamb/parquet-site/pull/1

On Fri, Jun 5, 2026 at 8:39 AM Antoine Pitrou <[email protected]> wrote:
>
> The purpose of the presets proposal is not to inform readers but to help
> users make a decision about which features to enable when writing a
> Parquet file.
>
> For example, a user of PyArrow could, instead of passing an elaborate
> set of flags, call `pq.write_table(tab, 'file.pq', preset='2024-01')`.
>
> Regards
>
> Antoine.
>
>
> On 05/06/2026 at 00:01, Andrew Bell wrote:
> > How can a reader know that it has the tooling to read a file with this
> > approach? What is the hesitation to change version numbers?
> >
> > --
> >
> > Andrew Bell
> > [email protected]
> >
> > On Thu, Jun 4, 2026, 4:37 PM Ed Seidl <[email protected]> wrote:
> >
> >> On 2026/06/04 20:17:45 Ryan Blue wrote:
> >>> What's a preset? Could you describe the idea in this discussion so we can
> >>> keep it in one place?
> >>>
> >>
> >> The concept was introduced earlier in this thread by Antoine.
> >> https://lists.apache.org/thread/gvw48wrkhgl83jljhd1hzb668ys9zvqx
> >>
> >> Cheers,
> >> Ed
> >>
> >
>
