> I agree with the categories, but I want to be careful about terminology. I
> would call these *forward* compatible or *forward* incompatible.

I agree. The forward terminology is more precise (Ed pointed out the same
discrepancy in https://github.com/apache/parquet-site/pull/186), and I will
endeavor to use it going forward.
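
To make the distinction concrete, here is a minimal sketch in Python. It
assumes a file carries an explicit list of features, each flagged as required
for reading or not. That list, the `Feature` class, and the feature labels are
all hypothetical and used only for illustration; they are not part of the
Parquet format or of any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str               # hypothetical feature label
    required_to_read: bool  # True if a reader must understand it to decode the data


def blocking_features(file_features, supported):
    """Names of features this reader needs but does not implement, sorted."""
    return sorted(
        f.name for f in file_features
        if f.required_to_read and f.name not in supported
    )


def can_read(file_features, supported):
    """A reader can read a file when no required feature is missing."""
    return not blocking_features(file_features, supported)


# Hypothetical readers, described by the features they implement.
OLD_READER = {"DATA_PAGE_V1"}
NEW_READER = {"DATA_PAGE_V1", "DATA_PAGE_V2"}

# Hypothetical files, described by the features they use.
v1_file = [Feature("DATA_PAGE_V1", True)]
v1_file_with_index = [Feature("DATA_PAGE_V1", True), Feature("PAGE_INDEX", False)]
new_encoding_file = [Feature("DATA_PAGE_V1", True), Feature("NEW_ENCODING", True)]

# Backward compatible: the newer reader still reads the older file.
print(can_read(v1_file, NEW_READER))
# Forward compatible: the older reader ignores the optional PageIndex.
print(can_read(v1_file_with_index, OLD_READER))
# Forward incompatible: the older reader cannot decode the new encoding.
print(blocking_features(new_encoding_file, OLD_READER))
```

The only design choice here is the `required_to_read` flag: features an older
reader can safely ignore are forward compatible, and anything it must
understand to decode the data is forward incompatible.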



On Fri, Jun 5, 2026 at 3:58 PM Ryan Blue <[email protected]> wrote:

>    - backwards *compatible*: old readers can still read files (e.g.
>    PageIndex, new logical types)
>    - backwards *incompatible*: old readers cannot read the files
>    (e.g. new encodings, proposed path_in_schema removal, …)
>
> I agree with the categories, but I want to be careful about terminology. I
> would call these *forward* compatible or *forward* incompatible. The reason
> is that *backward* compatible usually means that newer versions can
> interact with older data, rather than older versions interacting with newer
> data.
>
> For example, backward compatibility would mean that although a version
> writes DataPageV2, it can still read DataPageV1. On the other hand, forward
> compatibility is when we design features in a way that older readers will
> ignore if they don’t know about them, like additional thrift fields that
> are not necessary for correctly reading the data, but may allow clients to
> find specific data more quickly.
>
> I tend to refer to “forward-incompatible” changes when we’re talking about
> breaking changes that would cause any existing reader to fail or produce
> incorrect results.
>
> Ryan
>
> On Fri, Jun 5, 2026 at 7:14 AM Andrew Lamb <[email protected]> wrote:
>
> > Ryan and Dan made a great point on the call the other day that there are
> > two categories of new features:
> > - backwards **compatible**: old readers can still read files (e.g.
> > PageIndex, new logical types)
> > - backwards **incompatible**: old readers cannot read the files
> > (e.g. new encodings, proposed path_in_schema removal, ...)
> >
> > The new features / changes we have recently approved and added to the
> > spec are mostly **backwards compatible** (e.g. Variant) and thus didn't
> > need ecosystem-wide coordination
> >
> > I think there is more friction on new incompatible changes (older readers
> > will not be able to read files written with these features)
> >
> > I agree with Dan, Ryan and others that unless we define some signal in the
> > file itself (e.g. version 3 😬) it will be close to impossible for users to
> > understand which features are compatible with other systems
> >
> > To help this process along, I made a PR to document more clearly which
> > features are in version 1 and which are in version 2 [1]. I also drafted
> > an example of what "V3" could look like [2].
> >
> > Andrew
> >
> > [1]: https://github.com/apache/parquet-site/pull/186
> > [2]: https://github.com/alamb/parquet-site/pull/1
> >
> > On Fri, Jun 5, 2026 at 8:39 AM Antoine Pitrou <[email protected]> wrote:
> >
> > >
> > > The purpose of the presets proposal is not to inform readers but to help
> > > users make a decision about which features to enable when writing a
> > > Parquet file.
> > >
> > > For example, a user of PyArrow could, instead of passing an elaborate
> > > set of flags, call `pq.write_table(tab, 'file.pq', preset='2024-01')`.
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > On Jun 5, 2026 at 00:01, Andrew Bell wrote:
> > > > How can a reader know that it has the tooling to read a file with this
> > > > approach? What is the hesitation to change version numbers?
> > > >
> > > > --
> > > >
> > > > Andrew Bell
> > > > [email protected]
> > > >
> > > > On Thu, Jun 4, 2026, 4:37 PM Ed Seidl <[email protected]> wrote:
> > > >
> > > >> On 2026/06/04 20:17:45 Ryan Blue wrote:
> > > > >>> What's a preset? Could you describe the idea in this discussion so
> > > > >>> we can keep it in one place?
> > > >>>
> > > >>
> > > >> The concept was introduced earlier in this thread by Antoine.
> > > >> https://lists.apache.org/thread/gvw48wrkhgl83jljhd1hzb668ys9zvqx
> > > >>
> > > >> Cheers,
> > > >> Ed
> > > >>
> > > >
> > >
> > >
> > >
> >
>
