I just wanted to give you a heads up. I have sent a new PR [1] for review
that adds versioning to the data model for supported features.  I used AI
to partially backfill the data and did spot checks to make sure it looks
reasonable. There is definitely missing data.

With this change I also added a table that tries to summarize engine read
support by feature and year.  The table is pessimistic: where it has no
version data, it assumes that only the most recent release of the
library/engine supports the feature.  Going forward I expect the data to be
accurate, so there are two paths forward (and we should pursue both):

1.  Improve the accuracy of the data by backfilling the release versions
that we can determine with certainty.
2.  Make sure we populate version data as new features are added.

Without 1, at least we have a baseline of what is readable as of 2025.
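
The pessimistic rule used by the table can be illustrated with a minimal
sketch. This is not the code in the PR; the engine names, version strings,
and the function name below are hypothetical and exist only to show the
fallback: when no version is recorded for a feature, assume only the latest
release supports it.

```python
from typing import Optional


def assumed_first_version(recorded: Optional[str], latest_release: str) -> str:
    """Return the version assumed to be the first one supporting a feature.

    Pessimistic fallback: with no recorded version, assume that only the
    most recent release supports the feature.
    """
    return recorded if recorded is not None else latest_release


# Hypothetical data: latest release per engine, and the recorded version
# (if any) in which each engine gained support for one feature.
latest = {"engine-a": "3.0", "engine-b": "1.5"}
recorded = {"engine-a": "2.1", "engine-b": None}

table = {
    engine: assumed_first_version(recorded.get(engine), latest[engine])
    for engine in latest
}
```

Backfilling a real version for an engine replaces its fallback entry, which
can only move the assumed first version earlier, never later.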

I don't plan on investing much more time here, but if people notice bugs
related to the infrastructure, please raise issues and tag me.

Thanks,
Micah

[1] https://github.com/apache/parquet-site/pull/144
