Thanks Alkis! I so wanted to love this proposal, but unfortunately I don't think it will work.

> 2. Mark FileMetaData.version optional in thrift. A writer that sets the
> bundle field omits version. A file carries exactly one of the two.
>
> The trick is (2): the deployed readers I checked hard-fail at footer parse
> when FileMetaData.version is missing: parquet-java, arrow-cpp, parquet-rs
> and DuckDB. They all enforce its presence even though the spec says to
> ignore its value. Old readers fail immediately on open instead of tripping
> on obscure errors later, or worse, reading bad data.

The problem is that the footer metadata is written and parsed depth first. Validation only happens after all of the fields of a struct have been read. So even if "version" is missing, an old reader won't know this until well after the row group metadata has been parsed. If "path_in_schema" is missing as well, that error will be thrown first.

The only ways I can think to make this work are pretty convoluted. I can share my ideas in a separate thread if there's interest.

Cheers,
Ed
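
P.S. In case it helps to see the ordering concretely, here is a toy Python sketch of the behaviour described above. It is not how any real reader is implemented: nested dicts stand in for the serialized thrift footer, the `SCHEMA` table is a heavily trimmed, hypothetical subset of the Parquet structs, and `parse` and `first_error` are made-up helpers. The only thing it is meant to show is that required-field checks run after a struct's fields, including nested structs, have been read.

```python
class MissingRequiredField(Exception):
    """Raised when a struct finishes parsing without a required field."""


# Hypothetical, trimmed-down schema: for each struct, its required fields and
# which fields hold nested structs. The real Parquet structs have more fields.
SCHEMA = {
    "FileMetaData": {
        "required": ["version", "schema", "num_rows", "row_groups"],
        "children": {"row_groups": "RowGroup"},
    },
    "RowGroup": {"required": ["columns"], "children": {"columns": "ColumnChunk"}},
    "ColumnChunk": {"required": [], "children": {"meta_data": "ColumnMetaData"}},
    "ColumnMetaData": {"required": ["path_in_schema"], "children": {}},
}


def parse(struct_name, data, trace):
    """Read every field (recursing into nested structs), then validate."""
    spec = SCHEMA[struct_name]
    trace.append(f"enter {struct_name}")
    for field, value in data.items():
        child = spec["children"].get(field)
        if child is None:
            continue  # scalar field: nothing to recurse into
        items = value if isinstance(value, list) else [value]
        for item in items:
            parse(child, item, trace)  # depth first: children finish before us
    # Validation only happens here, after all fields have been consumed.
    for field in spec["required"]:
        if field not in data:
            raise MissingRequiredField(f"{struct_name}.{field}")
    trace.append(f"validated {struct_name}")


def first_error(footer):
    """Return (name of the first missing required field or None, parse trace)."""
    trace = []
    try:
        parse("FileMetaData", footer, trace)
    except MissingRequiredField as exc:
        return str(exc), trace
    return None, trace


# Footer with "version" omitted but otherwise complete.
no_version = {
    "schema": [],
    "num_rows": 0,
    "row_groups": [{"columns": [{"meta_data": {"path_in_schema": ["a"]}}]}],
}

# Footer with both "version" and "path_in_schema" omitted.
no_version_no_path = {
    "schema": [],
    "num_rows": 0,
    "row_groups": [{"columns": [{"meta_data": {}}]}],
}
```

With `no_version`, `first_error` reports `FileMetaData.version`, but only after the trace shows the row group has been entered and validated. With `no_version_no_path`, it reports `ColumnMetaData.path_in_schema` and the check on `version` is never reached.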
