I am also OK with the proposed solution in the document.
However, I think the doc itself needs one last wording change.
I have left more details in comments but here is the gist:
This effort is driven by a group of people in the community and not one
vendor in particular, even if said people do sometimes work for vendors.
To reflect this, instead of saying the UUID identifies a Vendor, we should
describe it as an extension ID.
Then I'd remove all instances of the word "Vendor" and instead
refer to "Extensions" identified by this UUID.
This might not change anything in the implementation, but it is important that
the document reflect how the community works.

Specifically:

"Vendor introduces a Flatbuffers variant of FileMetaData." => "This
extension introduces a Flatbuffers variant of FileMetaData..."

"The UUID is picked by the Vendor once and used throughout the
experiments." => "The UUID is picked for this specific extension and used
throughout the experiments."

"At some point Vendor decides that this is amazing and should be shared
with the world at large to advance Parquet. " => "At some point, the
community decides this extension is ready and proposes it for inclusion."


On Mon, Jul 22, 2024 at 10:11 PM Micah Kornfield <[email protected]>
wrote:

> Hi Alkis,
> Thanks for the revision.  I'm OK with this as is, we can maybe wait a few
> more days to see if anybody else has comments and then discuss
> implementation of the extension mechanism?
>
> Cheers,
> Micah
>
> On Thu, Jul 18, 2024 at 10:22 PM Alkis Evlogimenos
> <[email protected]> wrote:
>
> > After Jul 17th's Parquet Sync feedback I have updated the extensions
> > proposal to remove the "reservation" mechanism. The updates are already
> > reflected in the document
> > <https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit>
> > and the PR <https://github.com/apache/parquet-format/pull/254>.
> >
> > On Fri, Jun 28, 2024 at 10:02 AM Alkis Evlogimenos <
> > [email protected]> wrote:
> >
> > > > I think we can at least have wording to encourage people doing
> > > > extensions to post them publicly and as part of the "reservation"
> > > > mechanism post a link the repo that they are being developed in, if
> > > > anyone is curious.
> > >
> > > Good point. I will try to come up with something in the PR - unless you
> > > beat me to it :)
> > >
> > > On Fri, Jun 28, 2024 at 7:15 AM Micah Kornfield <[email protected]
> >
> > > wrote:
> > >
> > >> >
> > >> > 1. experimentation/prototyping is more often than not faster to iterate if
> > >> > it is closed. Allowing this model of development was a primary goal of the
> > >> > design.
> > >>
> > >>
> > >> I agree there are advantages here.  I think a large amount of speed comes
> > >> from not having to gain consensus in the community.
> > >>
> > >> At the end of the day, I don't think there is any mechanism here to ensure
> > >> everybody works in public, but I think we can at least have wording to
> > >> encourage people doing extensions to post them publicly and as part of the
> > >> "reservation" mechanism post a link the repo that they are being developed
> > >> in, if anyone is curious.  I think this would be particularly useful if
> > >> there really is an intent for a number of organizations to experiment with
> > >> new footer designs (but possibly also in others).
> > >>
> > >> Thanks,
> > >> Micah
> > >>
> > >>
> > >>
> > >>
> > >> On Wed, Jun 26, 2024 at 9:33 AM Alkis Evlogimenos
> > >> <[email protected]> wrote:
> > >>
> > >> > Thank you for taking a look Micah.
> > >> >
> > >> > On the topic of openness there are various aspects that we have considered.
> > >> > 1. experimentation/prototyping is more often than not faster to iterate if
> > >> > it is closed. Allowing this model of development was a primary goal of the
> > >> > design.
> > >> > 2. when the design is final, keeping the design closed should have some
> > >> > drawbacks. Duplicating content to support old readers puts some natural
> > >> > incentive to make extensions official because at that point one can drop
> > >> > the fat from the files and move on. Another aspect of the design is the
> > >> > choice of a single extension field-id which makes the extension space tiny.
> > >> > This in turn means that it is difficult to interop with others without
> > >> > breaking their extensions. Ergo the easiest path to any interop is to open
> > >> > the extension.
> > >> >
> > >> > The above, while not enforcing work to happen in the open, strike some
> > >> > balance in between.
> > >> >
> > >> > I am open to suggestions on how to further incentivize opening extensions.
> > >> >
> > >> > On Wed, Jun 26, 2024 at 6:04 PM Micah Kornfield <
> > [email protected]>
> > >> > wrote:
> > >> >
> > >> > > Hi Alkis,
> > >> > > I'm generally in favor of this, my main concern/question is trying to
> > >> > > encourage work to be in the open.  I don't think in the long run it is good
> > >> > > for users to always have proprietary extensions inside of Parquet.
> > >> > >
> > >> > > IMO, I think the next steps would be to add implementations to write out
> > >> > > the footer extension points.
> > >> > >
> > >> > > Thanks,
> > >> > > Micah
> > >> > >
> > >> > > On Mon, Jun 24, 2024 at 1:24 PM Alkis Evlogimenos
> > >> > > <[email protected]> wrote:
> > >> > >
> > >> > > > The snafus are fixed. The original should work now.
> > >> > > >
> > >> > > > On Sun, 23 Jun 2024, 17:58 Alkis Evlogimenos, <
> > >> > > > [email protected]> wrote:
> > >> > > >
> > >> > > > > Due to some sharing snafus with automation, please request access to
> > >> > > > > comment. If you are just reading I've published this here:
> > >> > > > > https://docs.google.com/document/d/e/2PACX-1vThXkhHNozn_p1ZZWF-nCzOtoP1lKmkaV4Legq2FaRiIgwyY2XC9AmKpBtpeF8jbBB4wfjmQ6UTg03k/pub
> > >> > > > >
> > >> > > > > On Fri, Jun 21, 2024 at 10:29 AM Alkis Evlogimenos <
> > >> > > > > [email protected]> wrote:
> > >> > > > >
> > >> > > > >> Hey folks.
> > >> > > > >>
> > >> > > > >> I want to move the extension PR
> > >> > > > >> <https://github.com/apache/parquet-format/pull/254> forward.
> > >> > > > >> Unfortunately the discussion was spread across the PR, other threads and
> > >> > > > >> documents making it slow to progress. To avoid further fragmentation I have
> > >> > > > >> put together a document
> > >> > > > >> <https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit>
> > >> > > > >> discussing the extensions mechanism in isolation. I believe the document
> > >> > > > >> addresses all the concerns/comments from the PR and mailing list
> > >> > > > >> discussions brought forward so far.
> > >> > > > >>
> > >> > > > >> I propose we continue the discussion in the document and once everything
> > >> > > > >> is addressed, we finalize the PR.
> > >> > > > >>
> > >> > > > >> Thank you,
> > >> > > > >>
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>
