As part of this track I wrote up two draft PRs for what I think might be a
workable release process for new features, including concrete guidance on
when they should be enabled by default in other implementations:

https://github.com/apache/parquet-format/pull/258
https://github.com/apache/parquet-site/pull/61

I think having a process around this is critical to avoid the confusion
we've had over V2.

Comments/Feedback appreciated.

Thanks,
Micah


On Mon, May 27, 2024 at 10:46 PM Micah Kornfield <[email protected]>
wrote:

> As a follow-up to the "V3" discussions [1][2] I wanted to start a
> discussion to see who is interested in improving Parquet infrastructure.
> In particular, as we consider newer features, I think we should be
> considering regular major version releases, to allow for new features to
> become default.
>
> There are a few areas that we need volunteers for, so it would be good to
> get a sense of who is willing to help out.
>
> 1.  Is anyone who isn't already involved in the release process willing to
> volunteer to do parquet-java releases on a regular basis? I believe the
> requirement is being a committer/PMC member on Parquet but I might be
> mistaken.  Personally, given my current commitments, I think I can help
> drive one parquet-java release a year. I think once we can verify we have
> enough people we can try to formalize a new release policy with major
> version bumps to help ensure any work done on the other tracks will someday
> become defaults for consumers.
>
> 2.  Is anybody interested in looking more deeply into developing
> integration tests between the different Parquet implementations and major
> downstream consumers of Parquet?  I believe Apache Arrow has a pretty good
> model [3][4] in a lot of respects with cross-language integration tests,
> and nightly (via crossbow) integration tests with other consumers, but
> there are a wide variety of things that would improve the current state.
> One other possible concern is the amount of CI resources this might
> consume, and whether we will need contributions to fund it.
>
> 3.  I believe someone (maybe Ed) already mentioned they are working on a
> full feature matrix for different Parquet implementations but this was also
> called out as critical.  If no one else is interested, I can also start
> putting something together here.
>
> Anything else people want to bring up in the discussion?
>
> Thanks,
> Micah
>
> [1] https://lists.apache.org/thread/5jyhzkwyrjk9z52g0b49g31ygnz73gxo
> [2]
> https://docs.google.com/document/d/19hQLYcU5_r5nJB7GtnjfODLlSDiNS24GXAtKg9b0_ls/edit
> [3]
> https://arrow.apache.org/docs/format/Integration.html#integration-testing
> [4]  https://github.com/ursacomputing/crossbow
>
