On Tue, Jun 9, 2026 at 10:35 AM Antoine Pitrou <[email protected]> wrote:
> On 09/06/2026 at 15:18, Andrew Lamb wrote:
> > While working to document what features are forwards incompatible and what
> > parquet-format version they were introduced in[1], it occurs to me that we
> > **already have** a versioning scheme that is frequently released, time
> > based and clearly defines feature levels:
> >
> > parquet-format version (e.g. 2.11, 2.12, etc). <---- We already have this!
>
> Well, I don't understand how 2.11 is "time-based". The parquet-format
> repo doesn't even have a periodic release schedule.
>
> > The only missing piece is that parquet-format version is not recorded in
> > the metadata itself.
>
> Aren't we moving the goalposts here?
>
> IIRC the basis for this discussion was to inform Parquet *writers* about
> which features can safely be enabled. Recording the format version in a
> Parquet file's metadata does not help achieve that.

Huh? I'm totally lost. How do you inform a writer about anything? The
writer must conform to some specification, right? And it would be good if
the writer included some sort of indicator of that version so that readers
will know if they can consume a file and perhaps what code needs to be
invoked in order to do so.

What is the point of saying "Well, it's March 15 so there are some new
things that might now be part of a Parquet file"? I just don't get it.

I find Parquet versioning difficult enough, as AFAICT the only real
versioning is a git tag in the parquet-format repo, correct? There is no
easy-to-read list of changes unless I am missing something.

--
Andrew Bell
[email protected]
