I also really like the idea of a data-driven page. Thank you for doing this.

I agree with Antoine that YAML (with comments) would be preferable to
JSON.

I also personally slightly prefer using Hugo templates rather than a new
Python script, both to reduce the number of technologies required to build
parquet-site and for the reasons Micah mentions.

Andrew

On Wed, Dec 10, 2025 at 12:58 PM Micah Kornfield <[email protected]>
wrote:

> Hi Antoine,
>
> Thanks for the feedback.
>
>
> > 1) JSON is annoying to edit manually and doesn't support comments;
>
>
> Happy to switch to YAML.
>
> > 2) ingesting and processing data from a HTML templating
> > language is cumbersome)
>
>
> Ingesting is actually not the problem.  I agree processing is a little bit
> cumbersome but at this point, I'd rather keep this in Hugo for the
> following reasons:
>
> 1.  Python or even plain Go code might be preferable, but there is no way
> to do this within the Hugo framework.  This means there is a trade-off in
> developer/CI complexity for keeping a separate script up-to-date (right
> now, you can modify the template and Hugo will re-render automatically,
> making for a tight dev loop).
> 2.  At least at the moment, I don't anticipate a lot of changes needed to
> this template, and if we do find maintenance cumbersome, it shouldn't be
> a heavy lift to migrate to Python.
>
> As a path forward, let me see what changes are needed to migrate to YAML.
> If these require a lot of changes to the current Hugo template, I can
> rewrite it in Python. Otherwise, I'd prefer we do the migration when we
> observe it is needed.
>
> Cheers,
> Micah
>
> On Wed, Dec 10, 2025 at 2:38 AM Antoine Pitrou <[email protected]> wrote:
>
> >
> > Hi Micah,
> >
> > Thanks a lot for taking the time to do this.
> >
> > I think this is a good idea in principle. However, maintenance-wise,
> > I think it might be more future-proof to start with YAML files and have
> > the rendering done by a Python script.
> >
> > (rationale: 1) JSON is annoying to edit manually and doesn't support
> > comments; 2) ingesting and processing data from a HTML templating
> > language is cumbersome)
> >
> > As for point 2 below, I might also remind people of the idea I proposed
> > some while ago on the issue tracker, namely to define calendar-based
> > "presets" based on feature availability in the ~3 main open source
> > Parquet implementations:
> >
> > https://github.com/apache/parquet-format/issues/384#issuecomment-3406653123
> >
> > Regards
> >
> > Antoine.
> >
> >
> > Le 10/12/2025 à 02:02, Micah Kornfield a écrit :
> > > Hi Parquet Dev,
> > > I put a draft PR together [1] to refactor the implementation status
> > > page [2] to use JSON as the data layer and render it using Hugo code.
> > >
> > > My rationale for doing this:
> > > 1.  I think in the long run it will make it easier to review and make
> > > small updates from engines (mostly by comparing new/updated rows of JSON
> > > for a single engine). Adding new data for a new engine would only have to
> > > touch a file specific to that engine after it is first registered.
> > > 2.  I think it is a good idea to start collecting more metadata (in
> > > particular version number/release date) for implementations.  I think
> > > displaying/tracking this might get tricky without a more structured
> > > approach.  In particular, I think having different pivots on
> > > features/dates would be useful.
> > >
> > > Developing locally, the current iteration of the change generates
> > > effectively the same visible content. I also tried to add hyperlinks
> > > from notes to the actual note (at least in my browser these seem a bit
> > > weird, as they scroll just slightly past the note).
> > >
> > > Any objections to proceeding with this type of change?
> > >
> > > Thanks,
> > > Micah
> > >
> > > [1] https://github.com/apache/parquet-site/pull/143
> > > [2] https://parquet.apache.org/docs/file-format/implementationstatus/
> > >
> >
> >
> >
> >
>
