Hi,

I share Antoine's feeling that parquet-cpp seems to be fully
governed by the Apache Arrow PMC, not the Apache Parquet PMC. I
once discussed this with Xinli, and he told me that contributions to
parquet-cpp are no longer considered when promoting committers to
the Apache Parquet PMC.

Best,
Gang

On Thu, May 16, 2024 at 4:29 PM Antoine Pitrou <[email protected]> wrote:

> On Thu, 16 May 2024 10:08:42 +0200
> "Uwe L. Korn" <[email protected]> wrote:
> > On Tue, May 14, 2024, at 6:30 PM, Antoine Pitrou wrote:
> > > AFAIK, the only Parquet implementation under the Apache Parquet project
> > > is parquet-mr :-)
> >
> > This is not true. The parquet-cpp that resides in the arrow repository
> > is still controlled by the Apache Parquet PMC. Back then, we only merged
> > the codebases but kept control of it with the Apache Parquet project. I
> > know, it is hard to understand, but at least I have never seen a vote that
> > would move it out of the Apache Parquet project's "control".
>
> Ahah. Unfortunately, this doesn't match actual community practices. For
> example, when it is decided to give (Arrow) commit rights to a frequent
> Parquet C++ contributor, that decision is made among the Arrow PMC, not
> the Parquet PMC.
>
> Perhaps there would be value in aligning the legal situation with the
> _de facto_ situation?
>
> Regards
>
> Antoine.
>
>
> >
> > Best
> > Uwe
> > >
> > >
> > > On Tue, 14 May 2024 10:58:58 +0200
> > > Rok Mihevc <[email protected]> wrote:
> > >> Second Raphael's point.
> > >> Would it be reasonable to say a specification change requires
> > >> implementation in two parquet implementations within the Apache Parquet
> > >> project?
> > >>
> > >> Rok
> > >>
> > >> On Tue, May 14, 2024 at 10:50 AM Gang Wu <
> [email protected]> wrote:
> > >>
> > >> > IMHO, it looks more reasonable if a reference implementation is
> required
> > >> > to support most (not all) elements from the specification.
> > >> >
> > >> > Another question is: should we discuss (and vote for) each candidate
> > >> > one by one? We can start with parquet-mr, which is the most well-known
> > >> > implementation.
> > >> >
> > >> > Best,
> > >> > Gang
> > >> >
> > >> > On Tue, May 14, 2024 at 4:41 PM Raphael Taylor-Davies
> > >> > <[email protected]> wrote:
> > >> >
> > >> > > Potentially it would be helpful to flip the question around. As
> Andrew
> > >> > > articulates, a reference implementation is required to implement
> all
> > >> > > elements from the specification, and therefore the major
> consequence of
> > >> > > labeling parquet-mr thusly would be that any specification change
> would
> > >> > > have to be implemented within parquet-mr as part of the
> standardisation
> > >> > > process. It would be insufficient for it to be implemented in, for
> > >> > > example, two of the parquet implementations maintained by the
> arrow
> > >> > > project. I personally think that would be a shame and likely
> exclude
> > >> > > many people who would otherwise be interested in evolving the
> parquet
> > >> > > specification, but think that is at the core of this question.
> > >> > >
> > >> > > Kind Regards,
> > >> > >
> > >> > > Raphael
> > >> > >
> > >> > > On 13/05/2024 20:55, Andrew Lamb wrote:
> > >> > > > Question: Should we label parquet-mr or any other parquet
> > >> > > > implementations "reference implementations"?
> > >> > > >
> > >> > > > This came up as part of Vinoo's great PR to list different
> parquet
> > >> > > > reference implementations[1][2].
> > >> > > >
> > >> > > > The term "reference implementation" often has an official
> connotation.
> > >> > > For
> > >> > > > example the wikipedia definition is "a program that implements
> all
> > >> > > > requirements from a corresponding specification. The reference
> > >> > > > implementation ... should be considered the "correct" behavior
> of any
> > >> > > other
> > >> > > > implementation of it."[3]
> > >> > > >
> > >> > > > Given the close association of parquet-mr to the parquet
> standard, it
> > >> > is
> > >> > > a
> > >> > > > natural candidate to label as "reference implementation."
> However, it
> > >> > is
> > >> > > > not clear to me if there is consensus that it should be thusly
> labeled.
> > >> > > >
> > >> > > > I have a strong opinion that a consensus on this question would
> be very
> > >> > > > helpful. I don't actually have a strong opinion about the answer.
> > >> > > >
> > >> > > > Andrew
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > [1]:
> > >> > >
> https://github.com/apache/parquet-site/pull/53#discussion_r1582882267
> > >> > > > [2]:
> > >> > >
> https://github.com/apache/parquet-site/pull/53#discussion_r1598283465
> > >> > > > [3]:  https://en.wikipedia.org/wiki/Reference_implementation
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
>
>
>
>
