Hi Alkis,

I saw you addressed and resolved the comments in the doc. Thank you.
This looks good to me.
I would recommend that others who have been active in this conversation take a final look.

Best,
Julien
On Tue, Jul 23, 2024 at 3:06 PM Julien Le Dem <[email protected]> wrote:

> I am also OK with the proposed solution in the document.
> However I think the doc itself needs one last wording change.
> I have left more details in comments but here is the gist:
> This effort is driven by a group of people in the community and not one
> vendor in particular even if said people do sometimes work for vendors.
> To reflect this, instead of saying the UUID identifies a Vendor, we should
> describe it as an extension ID.
> Then I'd remove all instances of the word "Vendor" and instead
> refer to "Extensions" identified by this UUID.
> This might not change anything in the implementation but it is important
> for reflecting how the community works in the document.
>
> Specifically:
>
> "Vendor introduces a Flatbuffers variant of FileMetaData." => "This
> extension introduces a Flatbuffers variant of FileMetaData..."
>
> "The UUID is picked by the Vendor once and used throughout the
> experiments." => "The UUID is picked for this specific extension and used
> throughout the experiments."
>
> "At some point Vendor decides that this is amazing and should be shared
> with the world at large to advance Parquet. " => "At some point, the
> community decides this extension is ready and proposed for inclusion."
>
>
> On Mon, Jul 22, 2024 at 10:11 PM Micah Kornfield <[email protected]>
> wrote:
>
>> Hi Alkis,
>> Thanks for the revision. I'm OK with this as is, we can maybe wait a few
>> more days to see if anybody else has comments and then discuss
>> implementation of the extension mechanism?
>>
>> Cheers,
>> Micah
>>
>> On Thu, Jul 18, 2024 at 10:22 PM Alkis Evlogimenos
>> <[email protected]> wrote:
>>
>> > After Jul 17th's Parquet Sync feedback I have updated the extensions
>> > proposal to remove the "reservation" mechanism.
>> > The updates are already reflected in the document
>> > <https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit>
>> > and the PR <https://github.com/apache/parquet-format/pull/254>.
>> >
>> > On Fri, Jun 28, 2024 at 10:02 AM Alkis Evlogimenos <
>> > [email protected]> wrote:
>> >
>> > > > I think we can at least have wording to encourage people doing
>> > > > extensions to post them publicly and as part of the "reservation"
>> > > > mechanism post a link to the repo that they are being developed in,
>> > > > if anyone is curious.
>> > >
>> > > Good point. I will try to come up with something in the PR - unless you
>> > > beat me to it :)
>> > >
>> > > On Fri, Jun 28, 2024 at 7:15 AM Micah Kornfield <[email protected]>
>> > > wrote:
>> > >
>> > >> > 1. experimentation/prototyping is more often than not faster to
>> > >> > iterate if it is closed. Allowing this model of development was a
>> > >> > primary goal of the design.
>> > >>
>> > >> I agree there are advantages here. I think a large amount of speed comes
>> > >> from not having to gain consensus in the community.
>> > >>
>> > >> At the end of the day, I don't think there is any mechanism here to ensure
>> > >> everybody works in public, but I think we can at least have wording to
>> > >> encourage people doing extensions to post them publicly and as part of the
>> > >> "reservation" mechanism post a link to the repo that they are being developed
>> > >> in, if anyone is curious. I think this would be particularly useful if
>> > >> there really is an intent for a number of organizations to experiment with
>> > >> new footer designs (but possibly also in others).
>> > >>
>> > >> Thanks,
>> > >> Micah
>> > >>
>> > >> On Wed, Jun 26, 2024 at 9:33 AM Alkis Evlogimenos
>> > >> <[email protected]> wrote:
>> > >>
>> > >> > Thank you for taking a look Micah.
>> > >> >
>> > >> > On the topic of openness there are various aspects that we have
>> > >> > considered.
>> > >> > 1. experimentation/prototyping is more often than not faster to
>> > >> > iterate if it is closed. Allowing this model of development was a
>> > >> > primary goal of the design.
>> > >> > 2. when the design is final, keeping the design closed should have some
>> > >> > drawbacks. Duplicating content to support old readers puts some natural
>> > >> > incentive to make extensions official because at that point one can drop
>> > >> > the fat from the files and move on. Another aspect of the design is the
>> > >> > choice of a single extension field-id which makes the extension space tiny.
>> > >> > This in turn means that it is difficult to interop with others without
>> > >> > breaking their extensions. Ergo the easiest path to any interop is to open
>> > >> > the extension.
>> > >> >
>> > >> > The above, while not enforcing work to happen in the open, strike some
>> > >> > balance in between.
>> > >> >
>> > >> > I am open to suggestions on how to further incentivize opening
>> > >> > extensions.
>> > >> >
>> > >> > On Wed, Jun 26, 2024 at 6:04 PM Micah Kornfield <[email protected]>
>> > >> > wrote:
>> > >> >
>> > >> > > Hi Alkis,
>> > >> > > I'm generally in favor of this, my main concern/question is trying to
>> > >> > > encourage work to be in the open. I don't think in the long run it is
>> > >> > > good for users to always have proprietary extensions inside of Parquet.
>> > >> > >
>> > >> > > IMO, I think the next steps would be to add implementations to write out
>> > >> > > the footer extension points.
>> > >> > >
>> > >> > > Thanks,
>> > >> > > Micah
>> > >> > >
>> > >> > > On Mon, Jun 24, 2024 at 1:24 PM Alkis Evlogimenos
>> > >> > > <[email protected]> wrote:
>> > >> > >
>> > >> > > > The snafus are fixed.
>> > >> > > > The original should work now.
>> > >> > > >
>> > >> > > > On Sun, 23 Jun 2024, 17:58 Alkis Evlogimenos, <
>> > >> > > > [email protected]> wrote:
>> > >> > > >
>> > >> > > > > Due to some sharing snafus with automation, please request access to
>> > >> > > > > comment. If you are just reading I've published this here:
>> > >> > > > >
>> > >> > > > > https://docs.google.com/document/d/e/2PACX-1vThXkhHNozn_p1ZZWF-nCzOtoP1lKmkaV4Legq2FaRiIgwyY2XC9AmKpBtpeF8jbBB4wfjmQ6UTg03k/pub
>> > >> > > > >
>> > >> > > > > On Fri, Jun 21, 2024 at 10:29 AM Alkis Evlogimenos <
>> > >> > > > > [email protected]> wrote:
>> > >> > > > >
>> > >> > > > >> Hey folks.
>> > >> > > > >>
>> > >> > > > >> I want to move the extension PR
>> > >> > > > >> <https://github.com/apache/parquet-format/pull/254> forward.
>> > >> > > > >> Unfortunately the discussion was spread across the PR, other threads and
>> > >> > > > >> documents making it slow to progress. To avoid further fragmentation I have
>> > >> > > > >> put together a document
>> > >> > > > >> <https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit>
>> > >> > > > >> discussing the extensions mechanism in isolation. I believe the document
>> > >> > > > >> addresses all the concerns/comments from the PR and mailing list
>> > >> > > > >> discussions brought forward so far.
>> > >> > > > >>
>> > >> > > > >> I propose we continue the discussion in the document and once everything
>> > >> > > > >> is addressed, we finalize the PR.
>> > >> > > > >>
>> > >> > > > >> Thank you,
