> I agree with the categories, but I want to be careful about terminology. I
> would call these *forward* compatible or *forward* incompatible.

I agree. The forward terminology is more precise (Ed pointed out the same
discrepancy in https://github.com/apache/parquet-site/pull/186) and I will
endeavor to use it going forward.

On Fri, Jun 5, 2026 at 3:58 PM Ryan Blue <[email protected]> wrote:

> - backwards *compatible*: old readers can still read files (e.g.
> PageIndex, new logical types)
> - backwards *incompatible*: old readers can not still read the files
> (e.g. new encodings, proposed path_in_schema removal, …)
>
> I agree with the categories, but I want to be careful about terminology. I
> would call these *forward* compatible or *forward* incompatible. The reason
> is that *backward* compatible usually means that newer versions can
> interact with older data, rather than older versions interacting with newer
> data.
>
> For example, backward compatibility would mean that although a version
> writes DataPageV2, it can still read DataPageV1. On the other hand, forward
> compatibility is when we design features in a way that older readers will
> ignore if they don’t know about them, like additional thrift fields that
> are not necessary for correctly reading the data, but may allow clients to
> find specific data more quickly.
>
> I tend to refer to “forward-incompatible” changes when we’re talking about
> breaking changes that would cause any existing reader to fail or produce
> incorrect results.
>
> Ryan
>
> On Fri, Jun 5, 2026 at 7:14 AM Andrew Lamb <[email protected]> wrote:
>
> > Ryan and Dan made a great point on the call the other day that there are
> > two categories of new features:
> > - backwards **compatible**: old readers can still read files (e.g.
> > PageIndex, new logical types)
> > - backwards **incompatible**: old readers can not still read the files
> > (e.g. new encodings, proposed path_in_schema removal, ...)
> >
> > The recently approved new features / changes we have added to the spec
> > recently are mostly **backwards compatible** (e.g. Variant) and thus
> > didn't need ecosystem wide coordination
> >
> > I think there is more friction on new incompatible changes (older readers
> > will not be able to read files written with these features)
> >
> > I agree with Dan, Ryan and others that unless we define some signal in the
> > file itself (e.g. version 3 😬) it will be close to impossible for users to
> > understand which features are compatible with other systems
> >
> > To help this process along, I made a PR to document more clearly which
> > features are in which version 1 / version 2[1] that I think will help. I
> > also drafted an example of what "V3" could look like [2].
> >
> > Andrew
> >
> > [1]: https://github.com/apache/parquet-site/pull/186
> > [2]: https://github.com/alamb/parquet-site/pull/1
> >
> > On Fri, Jun 5, 2026 at 8:39 AM Antoine Pitrou <[email protected]> wrote:
> >
> > >
> > > The purpose of the presets proposal is not to inform readers but to help
> > > users make a decision about which features to enable when writing a
> > > Parquet file.
> > >
> > > For example, a user of PyArrow could, instead of passing an elaborate
> > > set of flags, call `pq.write_table(tab, 'file.pq', preset='2024-01')`.
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > On 05/06/2026 at 00:01, Andrew Bell wrote:
> > > > How can a reader know that it has the tooling to read a file with this
> > > > approach? What is the hesitation to change version numbers?
> > > >
> > > > --
> > > >
> > > > Andrew Bell
> > > > [email protected]
> > > >
> > > > On Thu, Jun 4, 2026, 4:37 PM Ed Seidl <[email protected]> wrote:
> > > >
> > > >> On 2026/06/04 20:17:45 Ryan Blue wrote:
> > > >>> What's a preset? Could you describe the idea in this discussion so we can
> > > >>> keep it in one place?
> > > >>>
> > > >>
> > > >> The concept was introduced earlier in this thread by Antoine.
> > > >> https://lists.apache.org/thread/gvw48wrkhgl83jljhd1hzb668ys9zvqx
> > > >>
> > > >> Cheers,
> > > >> Ed
> > > >>
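
To make the forward-compatible / forward-incompatible distinction from this thread concrete, here is a minimal sketch of how an older reader might treat a newer file. It is only an illustration under stated assumptions: the metadata layout, the feature names, and `old_reader_open` are all hypothetical and are not taken from the Parquet format or any Parquet library.

```python
# Hypothetical sets describing what an "old" reader understands.
# These names are illustrative, not identifiers from the Parquet spec.
KNOWN_ENCODINGS = {"PLAIN", "RLE"}
KNOWN_OPTIONAL_FIELDS = {"statistics"}


class UnsupportedFeature(Exception):
    """Raised when a file uses a feature the reader cannot safely skip."""


def old_reader_open(file_meta: dict) -> dict:
    """Simulate an older reader opening a file described by `file_meta`.

    `file_meta` is a made-up structure with an "encodings" list and an
    "optional" dict of extra metadata fields.
    """
    # Forward-incompatible change: the data cannot be decoded without
    # understanding the encoding, so the reader must fail.
    unknown = set(file_meta["encodings"]) - KNOWN_ENCODINGS
    if unknown:
        raise UnsupportedFeature(f"unknown encodings: {sorted(unknown)}")

    # Forward-compatible change: unknown optional fields are not needed to
    # read the data correctly, so the reader ignores them.
    optional = file_meta.get("optional", {})
    used = {k: v for k, v in optional.items() if k in KNOWN_OPTIONAL_FIELDS}
    ignored = sorted(set(optional) - KNOWN_OPTIONAL_FIELDS)
    return {"used": used, "ignored": ignored}


# A file with an extra optional field still opens; the field is skipped.
result = old_reader_open(
    {"encodings": ["PLAIN"], "optional": {"statistics": 1, "page_index": 2}}
)
print(result["ignored"])  # ['page_index']
```

A file listing an encoding outside `KNOWN_ENCODINGS` raises `UnsupportedFeature` instead, which is the case where some signal in the file itself (such as a version number) would help users predict the failure.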
