When is it important for these legal headers to be present? For example, does every commit have to have them? Or is this a clean-up exercise, where right before a release a script could modify every contribution and add the required headers to the text files?
Understanding that small point could yield a small utility on the client side, like "asf-bless.sh", which could parse the tree of files and add headers where any are missing before contributors commit and generate a pull request.

Brandon

On Sun, Jun 13, 2021 at 4:12 AM Matt Casters <[email protected]> wrote:

> You are both right of course. I'm not sure why I didn't see it before but
> we can simply add an option to add a header of choice defined in a file
> somewhere to whichever file format we generate now (XML) or in the future.
> Sure, it would not be visible to users of the GUI but it would solve the
> conundrum.
>
> On Sun, 13 Jun 2021 at 07:28, Julian Hyde <[email protected]> wrote:
>
> > A couple more:
> >
> > 5. Make the release manager responsible for adding headers to files that
> > are missing them.
> > 6. Use a tool such as autostyle [1] that detects problems and fixes them.
> >
> > Julian
> >
> > [1] https://github.com/autostyle/autostyle
> >
> > > On Jun 12, 2021, at 7:34 PM, Hans Van Akelyen <[email protected]> wrote:
> > >
> > > The annoying part is that it needs to have the header when added to the
> > > repository but outside of the repository it doesn't.
> > > We currently have around 300 pipelines and 100 workflows in the repository
> > > and we are advocating how easy it is to contribute these things to hop. Now
> > > we would have to say, well it is easy but you need to add a header... and
> > > guess what... every time you change something you will need to add it again
> > > because using the save button will overwrite your content.
> > >
> > > There are a couple of ways to solve this:
> > > 1) automate it with a github actions/Jenkins
> > > 2) manually add the header
> > > 3) add a toggle to the gui/code that needs to be activated when you are
> > > creating pipelines for the repository
> > > 4) move to a binary format
> > >
> > > 1. Is not allowed/possible afaik, Jenkins definitely does not have write
> > > access to the code base, Github might have permission to write to a pr. But
> > > then the question arises if it is even allowed to add a header to a file
> > > without the user confirming this.
> > >
> > > 2. This adds another boundary for non-developers/regular users to
> > > contribute samples and integration tests, they don't care about the content
> > > of a hpl/hwf in their eyes this is a "binary-file" that needs no editing,
> > > and surely not every time you change a minor thing.
> > > We are really trying hard to convince people to contribute small things
> > > like a single sample or a single test, but noticed that even the usage of
> > > github and how to create a PR can be a "hard" process that requires
> > > hand-holding for our user base that consists mainly of non-developers. This
> > > would raise the bar a bit higher making it harder for those willing to jump.
> > >
> > > 3. This might work for the core developers/contributors but will probably
> > > be forgotten by the friendly user that wants to contribute once, meaning we
> > > would have to point to them to add the header or do it ourselves.
> > >
> > > 4. No need for headers here we could even keep the current xml structure
> > > but zip the content of the hpl/hwf
> > >
> > > So to summarize, in the short term getting a release out shouldn't be hard.
> > > One of us can add the header to all the files and be done with it. But in
> > > the long run this process is not sustainable.
> > >
> > > Cheers,
> > > Hans
> > >
> > > On Sun, 13 Jun 2021 at 00:52, Julian Hyde <[email protected]> wrote:
> > >
> > >> I still don’t see why the discussion about Apache release policy needs to
> > >> be connected with discussion about file formats. It’s simpler to resolve
> > >> the issue about release policy first, make the release, and come back and
> > >> discuss file format later.
> > >>
> > >> Regarding release policy. When a user contributes a test case to Hop, that
> > >> is a creative work according to copyright law. Like any contribution, we
> > >> don’t “claim copyright”; they retain copyright, but contribute under Apache
> > >> license. And we require that text files have a header.
> > >>
> > >> No one is proposing adding headers to pipeline and workflow files that are
> > >> not contributed to Hop.
> > >>
> > >> I find it hard to believe that adding a header to a test case will make it
> > >> behave differently, in the vast majority of cases. Exceptions can be made
> > >> for the few cases where it matters.
> > >>
> > >> Julian
> > >>
> > >>> On Jun 12, 2021, at 3:25 PM, Matt Casters <[email protected]> wrote:
> > >>>
> > >>> That's really my point: it's really not as straightforward at all like you
> > >>> claimed Julian. The files are produced by the Hop GUI and that's what we
> > >>> want. We want to test what is actually used by our end-users, not some
> > >>> theoretical use-case which is typically handled by JUnit/Mockito/Powermock
> > >>> and their ilk. It's this old-school vision that an XML file has to be
> > >>> written by hand or something like that which messes up this debate.
> > >>> The .hpl/.hwf file format does not and should not include the ASF header
> > >>> either. For our users it would be inappropriate as we can't claim
> > >>> copyright on works produced by others. In other words, when some person or
> > >>> company uses our software and creates a pipeline, we can't just claim
> > >>> copyright for that file. At least that's how I see things.
> > >>>
> > >>> As for YAML: my dislike for it is enormous but since it wouldn't solve the
> > >>> header issue I wouldn't pick it for that reason alone since it allows
> > >>> comments.
> > >>> Perhaps we should serialize in some binary format to get past
> > >>> this issue. Since we'll need to continue XML serialization anyway it's
> > >>> just a question of storing the integration tests and samples in a way that
> > >>> can be approved by the ASF.
> > >>>
> > >>> On Sat, Jun 12, 2021 at 10:54 PM Julian Hyde <[email protected]> wrote:
> > >>>
> > >>>> I don’t think the discussion about headers really forces this issue. It’s
> > >>>> a technical decision and shouldn’t be rushed.
> > >>>>
> > >>>> Regarding the headers. It is straightforward to add headers to existing
> > >>>> files. It is also straightforward to use a tool such as checkstyle to
> > >>>> enforce them (so, any PR that adds a .hpl file without a header will get a
> > >>>> build error, which the contributor will duly fix).
> > >>>>
> > >>>> In my opinion, Hop should allow multiple formats. XML is rather old, and
> > >>>> people find it difficult to read without practice. JSON is a bit more
> > >>>> modern, but has terrible support for multi-line strings and (in its
> > >>>> official form) doesn’t allow comments and is strict about quoting of
> > >>>> identifiers. YAML (or similar) is worth considering; its model is
> > >>>> compatible with JSON, it allows comments, it has much better support for
> > >>>> multi-line strings, and it tends to diff/merge easier than XML and JSON.
> > >>>>
> > >>>> Julian
> > >>>>
> > >>>>> On Jun 12, 2021, at 1:38 PM, Matt Casters <[email protected]> wrote:
> > >>>>>
> > >>>>> Folks,
> > >>>>>
> > >>>>> It's been up in the air for quite some time now but it looks like we're
> > >>>>> being forced by certain discussions in the release voting of 0.99-rc1. How
> > >>>>> would you feel about moving to JSON for the standard file format of
> > >>>>> pipelines and workflows?
> > >>>>> I propose .hpj and .hwj as extensions.
> > >>>>> This would push back our releases for a month or so while we convert the
> > >>>>> remaining serialization code to the new @HopMetadataProperty API
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Matt
> > >>>
> > >>> --
> > >>> Neo4j Chief Solutions Architect
> > >>> ✉ [email protected]
> > >>> ☎ +32486972937
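
For reference, the header-adding utility floated at the top of this thread (the hypothetical "asf-bless.sh") could be sketched roughly as below. This is only a minimal Python sketch under assumptions the thread does not confirm: that .hpl and .hwf files are plain XML text, that an XML comment placed right after the XML declaration is an acceptable spot for the header, and that the header wording is an abbreviated placeholder rather than the full required text. The function names (has_header, add_header, bless_tree) are made up for illustration.

```python
import os

# ASSUMPTION: abbreviated placeholder, not the full ASF license header text.
HEADER_TEXT = ("Licensed to the Apache Software Foundation (ASF) under one "
               "or more contributor license agreements.")
# String whose presence is taken to mean "this file already has a header".
MARKER = "Licensed to the Apache Software Foundation"
# ASSUMPTION: only Hop pipeline and workflow files are processed.
EXTENSIONS = (".hpl", ".hwf")


def has_header(content: str) -> bool:
    """Return True if the license marker already appears in the file text."""
    return MARKER in content


def add_header(content: str) -> str:
    """Return content with the header comment inserted, unless already present.

    The comment goes after the XML declaration when there is one, because
    nothing may precede the declaration in a well-formed XML document.
    """
    if has_header(content):
        return content
    comment = "<!--\n" + HEADER_TEXT + "\n-->\n"
    if content.startswith("<?xml"):
        end = content.find("?>")
        if end != -1:
            end += len("?>")
            rest = content[end:].lstrip("\r\n")
            return content[:end] + "\n" + comment + rest
    return comment + content


def bless_tree(root: str, check_only: bool = False) -> list:
    """Walk root and return the paths that lack a header.

    Files are rewritten with the header unless check_only is True.
    """
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                content = f.read()
            if has_header(content):
                continue
            missing.append(path)
            if not check_only:
                with open(path, "w", encoding="utf-8") as f:
                    f.write(add_header(content))
    return missing
```

Calling bless_tree(".") before committing would add the comment wherever it is missing, and since files that already carry the marker are skipped, it can be re-run after every save from the GUI. Calling it with check_only=True only reports offenders, which is closer to the checkstyle-style enforcement discussed in the thread. Whether the Hop GUI preserves such a comment on load is not something this sketch addresses.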
