I just wanted to give you a heads up. I have sent a new PR [1] for review that adds versioning to the data model for supported features. I used AI to partially backfill the data and did spot checks to make sure it looks reasonable. There is definitely missing data.

With this change I also added a table that tries to summarize engine reader ability per feature by year. The table is pessimistic: where it has no data, it assumes that only the most recent library/engine version supports the feature.

Going forward I expect the data to be accurate, so there are two paths forward (and we should pursue both):

1. Improve the accuracy of the data by backfilling the release versions in which features became supported.
2. Make sure we populate version data as new features are added.

Without 1, at least we have a baseline of what is readable as of 2025.

I don't plan on investing too much more time here, but if people notice bugs related to the infrastructure, please raise issues and tag me.

Thanks,
Micah

[1] https://github.com/apache/parquet-site/pull/144
