Hi Micah,
Thanks a lot for taking the time to do this.
I think this is a good idea in principle. However, maintenance-wise,
I think it might be more future-proof to start with YAML files and have
the rendering done by a Python script.
(Rationale: 1) JSON is annoying to edit manually and doesn't support
comments; 2) ingesting and processing data from an HTML templating
language is cumbersome.)
As for point 2 below, I would also remind people of the idea I proposed
a while ago on the issue tracker, namely to define calendar-based
"presets" based on feature availability in the ~3 main open source
Parquet implementations:
https://github.com/apache/parquet-format/issues/384#issuecomment-3406653123
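To make the preset idea a bit more concrete, here is a minimal Python sketch. It assumes one possible reading of the proposal: a preset for a given date contains the features that every main implementation had released by that date. The implementation names, feature names, dates, and the `preset_for` helper are all hypothetical placeholders, not real Parquet data or an agreed design.

```python
from datetime import date

# Hypothetical data: feature -> {implementation: date of the first release
# that supports it}. None of these names or dates are real.
FIRST_SUPPORTED = {
    "feature-x": {
        "impl-a": date(2023, 1, 10),
        "impl-b": date(2023, 5, 2),
        "impl-c": date(2024, 2, 20),
    },
    "feature-y": {
        "impl-a": date(2024, 6, 1),
        "impl-b": date(2025, 3, 15),
        # impl-c has not released support yet
    },
}

# Placeholder for the ~3 main open source implementations.
MAIN_IMPLEMENTATIONS = ("impl-a", "impl-b", "impl-c")


def preset_for(cutoff, first_supported=FIRST_SUPPORTED,
               implementations=MAIN_IMPLEMENTATIONS):
    """Return the sorted features that every implementation supported by `cutoff`."""
    preset = []
    for feature, dates in first_supported.items():
        if all(impl in dates and dates[impl] <= cutoff
               for impl in implementations):
            preset.append(feature)
    return sorted(preset)


if __name__ == "__main__":
    print(preset_for(date(2025, 1, 1)))
```

With this placeholder data, a cutoff before the last implementation catches up yields an empty preset, which is the conservative behaviour one would probably want.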
Regards
Antoine.
On 10/12/2025 at 02:02, Micah Kornfield wrote:
Hi Parquet Dev,
I put a draft PR together [1] to refactor the implementation status page
[2] to use JSON as the data layer and render it using Hugo code.
My rationale for doing this:
1. I think in the long run it will make it easier to review and make small
updates from engines (mostly by comparing new/updated rows of JSON for a
single engine). Adding new data for a new engine would only have to touch a
file specific to that engine after it is first registered.
2. I think it is a good idea to start collecting more metadata (in
particular version number/release date) for implementations. I think
displaying/tracking this might get tricky without a more structured
approach. In particular I think having different pivots on features/dates
would be useful.
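As a rough illustration of the per-engine layout described in the two points above, here is a minimal sketch in Python (the PR itself renders with Hugo, not Python). The engine names, feature names, and field names (`engine`, `version`, `release_date`, `features`) are hypothetical and are not taken from the PR.

```python
import json

# Hypothetical per-engine documents; in practice each would live in its own
# file so that an update only touches that engine's data.
ENGINE_DOCS = [
    '{"engine": "engine-a", "version": "1.0", "release_date": "2025-01-15",'
    ' "features": {"feature-x": true, "feature-y": false}}',
    '{"engine": "engine-b", "version": "2.3", "release_date": "2025-06-01",'
    ' "features": {"feature-x": true}}',
]


def pivot_by_feature(docs):
    """Return {feature: {engine: supported}} built from per-engine JSON strings."""
    table = {}
    for raw in docs:
        record = json.loads(raw)
        for feature, supported in record["features"].items():
            table.setdefault(feature, {})[record["engine"]] = supported
    return table


def render_rows(table, engines):
    """Render one plain-text row per feature; '?' marks a missing entry."""
    rows = []
    for feature in sorted(table):
        cells = []
        for engine in engines:
            value = table[feature].get(engine)
            cells.append("?" if value is None else ("yes" if value else "no"))
        rows.append(f"{feature}: " + " | ".join(cells))
    return rows


if __name__ == "__main__":
    engines = ["engine-a", "engine-b"]
    for row in render_rows(pivot_by_feature(ENGINE_DOCS), engines):
        print(row)
```

Keeping the structured records separate from the rendering step is what would allow other pivots (for example by release date) to be added without touching the per-engine data.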
When built locally, the current iteration of the change generates
effectively the same visible content. I also tried to add hyperlinks from
note references to the actual notes (at least in my browser these behave a
bit oddly, as they scroll just slightly past the note).
Any objections to proceeding with this type of change?
Thanks,
Micah
[1] https://github.com/apache/parquet-site/pull/143
[2] https://parquet.apache.org/docs/file-format/implementationstatus/