Hi,

I share Antoine's feeling that parquet-cpp seems to be fully governed by the Apache Arrow PMC, not the Apache Parquet PMC. I once discussed this with Xinli, and he told me that contributions to parquet-cpp are no longer considered when promoting committers to the Apache Parquet PMC.

Best,
Gang

On Thu, May 16, 2024 at 4:29 PM Antoine Pitrou <[email protected]> wrote:

> On Thu, 16 May 2024 10:08:42 +0200
> "Uwe L. Korn" <[email protected]> wrote:
> > On Tue, May 14, 2024, at 6:30 PM, Antoine Pitrou wrote:
> > > AFAIK, the only Parquet implementation under the Apache Parquet project
> > > is parquet-mr :-)
> >
> > This is not true. The parquet-cpp that resides in the arrow repository
> > is still controlled by the Apache Parquet PMC. Back then, we only merged
> > the codebases but kept control of it with the Apache Parquet project. I
> > know, it is hard to understand, but at least I have never seen a vote
> > that would move it out of the Apache Parquet's project "control".
>
> Ahah. Unfortunately, this doesn't match actual community practices. For
> example, when it is decided to give (Arrow) commit rights to a frequent
> Parquet C++ contributor, that decision is made among the Arrow PMC, not
> the Parquet PMC.
>
> Perhaps there would be value in aligning the legal situation on the
> _de facto_ situation?
>
> Regards
>
> Antoine.
>
> >
> > Best
> > Uwe
> >
> > > On Tue, 14 May 2024 10:58:58 +0200
> > > Rok Mihevc <[email protected]> wrote:
> > >> Second Raphael's point.
> > >> Would it be reasonable to say specification change requires
> > >> implementation in two parquet implementations within Apache Parquet
> > >> project?
> > >>
> > >> Rok
> > >>
> > >> On Tue, May 14, 2024 at 10:50 AM Gang Wu <[email protected]> wrote:
> > >>
> > >> > IMHO, it looks more reasonable if a reference implementation is
> > >> > required to support most (not all) elements from the specification.
> > >> >
> > >> > Another question is: should we discuss (and vote for) each candidate
> > >> > one by one? We can start with parquet-mr which is most well-known
> > >> > implementation.
> > >> >
> > >> > Best,
> > >> > Gang
> > >> >
> > >> > On Tue, May 14, 2024 at 4:41 PM Raphael Taylor-Davies
> > >> > <[email protected]> wrote:
> > >> >
> > >> > > Potentially it would be helpful to flip the question around. As
> > >> > > Andrew articulates, a reference implementation is required to
> > >> > > implement all elements from the specification, and therefore the
> > >> > > major consequence of labeling parquet-mr thusly would be that any
> > >> > > specification change would have to be implemented within
> > >> > > parquet-mr as part of the standardisation process. It would be
> > >> > > insufficient for it to be implemented in, for example, two of the
> > >> > > parquet implementations maintained by the arrow project. I
> > >> > > personally think that would be a shame and likely exclude many
> > >> > > people who would otherwise be interested in evolving the parquet
> > >> > > specification, but think that is at the core of this question.
> > >> > >
> > >> > > Kind Regards,
> > >> > >
> > >> > > Raphael
> > >> > >
> > >> > > On 13/05/2024 20:55, Andrew Lamb wrote:
> > >> > > > Question: Should we label parquet-mr or any other parquet
> > >> > > > implementations "reference implementations"?
> > >> > > >
> > >> > > > This came up as part of Vinoo's great PR to list different
> > >> > > > parquet reference implementations[1][2].
> > >> > > >
> > >> > > > The term "reference implementation" often has an official
> > >> > > > connotation. For example the wikipedia definition is "a program
> > >> > > > that implements all requirements from a corresponding
> > >> > > > specification. The reference implementation ... should be
> > >> > > > considered the "correct" behavior of any other implementation
> > >> > > > of it."[3]
> > >> > > >
> > >> > > > Given the close association of parquet-mr to the parquet
> > >> > > > standard, it is a natural candidate to label as "reference
> > >> > > > implementation." However, it is not clear to me if there is
> > >> > > > consensus that it should be thusly labeled.
> > >> > > >
> > >> > > > I have a strong opinion that a consensus on this question would
> > >> > > > be very helpful. I don't actually have a strong opinion about
> > >> > > > the answer
> > >> > > >
> > >> > > > Andrew
> > >> > > >
> > >> > > > [1]: https://github.com/apache/parquet-site/pull/53#discussion_r1582882267
> > >> > > > [2]: https://github.com/apache/parquet-site/pull/53#discussion_r1598283465
> > >> > > > [3]: https://en.wikipedia.org/wiki/Reference_implementation
