I also really like the idea of a data-driven page. Thank you for doing this.

I agree with Antoine that YAML (with comments) would be preferable to JSON.

I also personally slightly prefer using Hugo templates rather than a new
Python script, to reduce the number of technologies required to build
parquet-site, as well as for the reasons Micah mentions.

Andrew

On Wed, Dec 10, 2025 at 12:58 PM Micah Kornfield <[email protected]> wrote:

> Hi Antoine,
>
> Thanks for the feedback.
>
> > 1) JSON is annoying to edit manually and doesn't support comments;
>
> Happy to switch to YAML.
>
> > 2) ingesting and processing data from a HTML templating
> > language is cumbersome)
>
> Ingesting is actually not the problem. I agree processing is a little bit
> cumbersome but at this point, I'd rather keep this in Hugo for the
> following reasons:
>
> 1. Python or even plain Go code might be preferable, but there is no way
> to do this within the Hugo framework. This means there is a trade-off in
> developer/CI complexity for keeping a separate script up-to-date (right
> now, you can modify the template and Hugo will re-render automatically,
> making for a tight dev-loop).
> 2. At least at the moment, I don't anticipate a lot of changes needed to
> this template, and if we do find maintenance cumbersome it shouldn't be
> a heavy lift to migrate to python.
>
> As a path forward, let me see what changes are needed to migrate to YAML,
> if these require a lot of changes to the current Hugo template, I can
> rewrite it in python. Otherwise, I'd prefer we do the migration when we
> observe it is needed.
>
> Cheers,
> Micah
>
> On Wed, Dec 10, 2025 at 2:38 AM Antoine Pitrou <[email protected]> wrote:
>
> >
> > Hi Micah,
> >
> > Thanks a lot for taking the time to do this.
> >
> > I think this is a good idea on the principle. However, maintenance-wise,
> > I think it might be more future-proof to start with YAML files and have
> > the rendering done by a Python script.
> >
> > (rationale: 1) JSON is annoying to edit manually and doesn't support
> > comments; 2) ingesting and processing data from a HTML templating
> > language is cumbersome)
> >
> > As for point 2 below, I might also remind people of the idea I proposed
> > some while ago on the issue tracker, namely to define calendar-based
> > "presets" based on feature availability in the ~3 main open source
> > Parquet implementations:
> > https://github.com/apache/parquet-format/issues/384#issuecomment-3406653123
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On 10 Dec 2025 at 02:02, Micah Kornfield wrote:
> > > Hi Parquet Dev,
> > > I put a draft PR together [1] to refactor the implementation status page
> > > [2] to use JSON as data layer and render it using hugo code.
> > >
> > > My rationale for doing this:
> > > 1. I think in the long run it will make it easier to review and make small
> > > updates from engines (mostly by comparing new/updated rows of JSON for a
> > > single engine). Adding new data for a new engine would only have to touch a
> > > file specific to that engine after it is first registered.
> > > 2. I think it is a good idea to start collecting more metadata (in
> > > particular version number/release date) for implementations. I think
> > > displaying/tracking this might get tricky without a more structured
> > > approach. In particular I think having different pivots on features/dates
> > > would be useful.
> > >
> > > Developing locally, the current iteration of the change generates
> > > effectively the same visible content. I also tried to add hyperlinks from
> > > notes to the actual note (at least on my browser these seem a bit weird as
> > > they scroll just slightly past the note).
> > >
> > > Any objections to proceeding with this type of change?
> > >
> > > Thanks,
> > > Micah
> > >
> > > [1] https://github.com/apache/parquet-site/pull/143
> > > [2] https://parquet.apache.org/docs/file-format/implementationstatus/
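
For illustration only, the per-engine data layer discussed in this thread
could be pivoted into a feature table roughly as follows. This is a minimal
Python sketch, not the code in the draft PR: the engine names, feature keys,
versions, and the `render_status_table` helper are all hypothetical, and the
records are written inline as the dicts that one YAML file per engine might
load into.

```python
# Hypothetical per-engine records, shaped like what one YAML file per engine
# might load into. Engine names, versions, and feature keys are made up.
ENGINES = [
    {
        "name": "engine-a",
        "features": {
            "feature-x": {"supported": True, "since": "1.2.0"},
            "feature-y": {"supported": False},
        },
    },
    {
        "name": "engine-b",
        "features": {
            "feature-x": {"supported": True, "since": "0.9.0"},
        },
    },
]


def render_status_table(engines):
    """Pivot per-engine feature records into a Markdown table (features as rows)."""
    names = [e["name"] for e in engines]
    # Collect every feature mentioned by any engine, in a stable sorted order.
    features = sorted({f for e in engines for f in e["features"]})
    lines = [
        "| Feature | " + " | ".join(names) + " |",
        "|---" * (len(names) + 1) + "|",
    ]
    for feature in features:
        cells = []
        for engine in engines:
            info = engine["features"].get(feature)
            if info is None:
                cells.append("?")  # this engine has not reported the feature
            elif info.get("supported"):
                cell = "yes"
                if "since" in info:
                    cell += f" ({info['since']})"  # first version with support
                cells.append(cell)
            else:
                cells.append("no")
        lines.append(f"| {feature} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_status_table(ENGINES))
```

Adding a new engine in this sketch only means adding one more record, which
matches the "touch a file specific to that engine" goal above. The sketch
shows only the shape of the data and the pivot; it takes no position on
whether the rendering lives in Hugo or in a separate script.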
