IMHO, it seems more reasonable to require a reference implementation to support most (not all) elements of the specification.
Another question is: should we discuss (and vote on) each candidate one by one? We can start with parquet-mr, which is the most well-known implementation.

Best,
Gang

On Tue, May 14, 2024 at 4:41 PM Raphael Taylor-Davies <[email protected]> wrote:

> Potentially it would be helpful to flip the question around. As Andrew
> articulates, a reference implementation is required to implement all
> elements from the specification, and therefore the major consequence of
> labeling parquet-mr thusly would be that any specification change would
> have to be implemented within parquet-mr as part of the standardisation
> process. It would be insufficient for it to be implemented in, for
> example, two of the parquet implementations maintained by the arrow
> project. I personally think that would be a shame and likely exclude
> many people who would otherwise be interested in evolving the parquet
> specification, but think that is at the core of this question.
>
> Kind Regards,
>
> Raphael
>
> On 13/05/2024 20:55, Andrew Lamb wrote:
> > Question: Should we label parquet-mr or any other parquet implementations
> > "reference implementations"?
> >
> > This came up as part of Vinoo's great PR to list different parquet
> > reference implementations[1][2].
> >
> > The term "reference implementation" often has an official connotation. For
> > example the wikipedia definition is "a program that implements all
> > requirements from a corresponding specification. The reference
> > implementation ... should be considered the "correct" behavior of any other
> > implementation of it."[3]
> >
> > Given the close association of parquet-mr to the parquet standard, it is a
> > natural candidate to label as "reference implementation." However, it is
> > not clear to me if there is consensus that it should be thusly labeled.
> >
> > I have a strong opinion that a consensus on this question would be very
> > helpful. I don't actually have a strong opinion about the answer
> >
> > Andrew
> >
> > [1]: https://github.com/apache/parquet-site/pull/53#discussion_r1582882267
> > [2]: https://github.com/apache/parquet-site/pull/53#discussion_r1598283465
> > [3]: https://en.wikipedia.org/wiki/Reference_implementation
