Hi Micah,

Thanks a lot for taking the time to do this.

I think this is a good idea in principle. However, maintenance-wise, I think it might be more future-proof to start with YAML files and have the rendering done by a Python script.

(rationale: 1) JSON is annoying to edit manually and doesn't support comments; 2) ingesting and processing data from an HTML templating language is cumbersome)
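
For illustration only, here is a minimal sketch of what such a rendering script could look like. It assumes the YAML files have already been parsed (for example with PyYAML's yaml.safe_load) into plain dicts; the engine names, feature names, and the render_table helper are all hypothetical and not taken from the PR.

```python
# Hypothetical data, shaped like what parsing one YAML file per engine
# might produce. Names and values are illustrative only.
ENGINES = {
    "engine-a": {"features": {"feature-x": True, "feature-y": False}},
    "engine-b": {"features": {"feature-x": True}},
}


def cell(value):
    """Map a support flag to table text; None means no data was recorded."""
    if value is None:
        return "?"
    return "yes" if value else "no"


def render_table(engines):
    """Render a feature-by-engine Markdown table from per-engine dicts."""
    names = sorted(engines)
    # Union of all features mentioned by any engine, in a stable order.
    features = sorted({f for e in engines.values() for f in e["features"]})
    lines = [
        "| Feature | " + " | ".join(names) + " |",
        "|" + " --- |" * (len(names) + 1),
    ]
    for feature in features:
        cells = [cell(engines[n]["features"].get(feature)) for n in names]
        lines.append("| " + feature + " | " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(ENGINES))
```

With this layout, adding an engine only touches that engine's own file, and the script decides how to pivot and display the data.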

As for point 2 below, I might also remind people of the idea I proposed a while ago on the issue tracker, namely to define calendar-based "presets" based on feature availability in the ~3 main open source Parquet implementations:
https://github.com/apache/parquet-format/issues/384#issuecomment-3406653123
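
As a rough sketch of one possible reading of that idea (the linked comment remains the reference), a preset for a given date could be computed as the set of features that every main implementation had released by that date. The implementation names, feature names, dates, and the preset function below are hypothetical placeholders.

```python
from datetime import date

# Hypothetical data: release date in which each implementation first
# supported each feature. A missing entry means "not supported yet".
AVAILABILITY = {
    "feature-x": {
        "impl-1": date(2020, 3, 1),
        "impl-2": date(2021, 6, 15),
        "impl-3": date(2019, 11, 5),
    },
    "feature-y": {
        "impl-1": date(2023, 1, 10),
        "impl-2": date(2024, 2, 20),
    },
}

# Placeholder names standing in for the ~3 main implementations.
MAIN_IMPLS = ("impl-1", "impl-2", "impl-3")


def preset(cutoff, availability=AVAILABILITY, impls=MAIN_IMPLS):
    """Return features that every listed implementation supported by cutoff."""
    return sorted(
        feature
        for feature, dates in availability.items()
        if all(impl in dates and dates[impl] <= cutoff for impl in impls)
    )


if __name__ == "__main__":
    print(preset(date(2022, 1, 1)))
```

This kind of computation needs per-implementation release dates, which is the extra metadata discussed in point 2.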

Regards

Antoine.


On 10/12/2025 at 02:02, Micah Kornfield wrote:
Hi Parquet Dev,
I put a draft PR together [1] to refactor the implementation status page
[2] to use JSON as the data layer and render it using Hugo code.

My rationale for doing this:
1.  I think in the long run it will make it easier to review and make small
updates from engines (mostly by comparing new/updated rows of JSON for a
single engine). Adding new data for a new engine would only have to touch a
file specific to that engine after it is first registered.
2.  I think it is a good idea to start collecting more metadata (in
particular version number/release date) for implementations.  I think
displaying/tracking this might get tricky without a more structured
approach.  In particular I think having different pivots on features/dates
would be useful.

Developing locally, the current iteration of the change generates
effectively the same visible content. I also tried to add hyperlinks from
notes to the actual note (at least on my browser these seem a bit weird as
they scroll just slightly past the note).

Any objections to proceeding with this type of change?

Thanks,
Micah

[1] https://github.com/apache/parquet-site/pull/143
[2] https://parquet.apache.org/docs/file-format/implementationstatus/



