You are both right, of course. I'm not sure why I didn't see it before, but we can simply add an option that adds a header of choice, defined in a file somewhere, to whichever file format we generate, now (XML) or in the future. Sure, it would not be visible to users of the GUI, but it would solve the conundrum.
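To make the idea concrete, here is a minimal sketch of what such an option could do, written in Python purely for illustration (Hop itself is Java). The function names and the header file location are hypothetical, not anything that exists in Hop today. It assumes the header is stored as plain text and is wrapped in an XML comment placed right after the XML declaration, and that adding it twice should be a no-op so that repeated saves do not pile up headers.

```python
from pathlib import Path


def as_xml_comment(header_text: str) -> str:
    """Wrap plain header text in an XML comment (hypothetical helper)."""
    # The XML spec forbids "--" inside a comment.
    if "--" in header_text:
        raise ValueError("XML comments must not contain '--'")
    return "<!--\n" + header_text.strip("\r\n") + "\n-->\n"


def add_header(xml_text: str, header_text: str) -> str:
    """Return xml_text with the header comment added, unless already present."""
    comment = as_xml_comment(header_text)
    if comment in xml_text:
        return xml_text  # already has this exact header: leave untouched
    if xml_text.startswith("<?xml"):
        # The XML declaration must stay first, so insert right after it.
        end = xml_text.index("?>") + 2
        declaration, rest = xml_text[:end], xml_text[end:].lstrip("\r\n")
        return declaration + "\n" + comment + rest
    return comment + xml_text


def add_header_to_file(target: Path, header_file: Path) -> None:
    """Apply add_header to a file on disk, reading the header from a file."""
    header_text = header_file.read_text(encoding="utf-8")
    original = target.read_text(encoding="utf-8")
    target.write_text(add_header(original, header_text), encoding="utf-8")
```

In Hop the equivalent would presumably sit in the serialization step, so that the save button writes the header whenever the option is switched on.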

On Sun, 13 Jun 2021 at 07:28, Julian Hyde <[email protected]> wrote:

> A couple more:
>
> 5. Make the release manager responsible for adding headers to files that
> are missing them.
> 6. Use a tool such as autostyle [1] that detects problems and fixes them.
>
> Julian
>
> [1] https://github.com/autostyle/autostyle
>
> > On Jun 12, 2021, at 7:34 PM, Hans Van Akelyen <[email protected]> wrote:
> >
> > The annoying part is that it needs to have the header when added to the
> > repository but outside of the repository it doesn't.
> > We currently have around 300 pipelines and 100 workflows in the
> > repository and we are advocating how easy it is to contribute these
> > things to hop. Now we would have to say, well it is easy but you need
> > to add a header... and guess what... every time you change something
> > you will need to add it again because using the save button will
> > overwrite your content.
> >
> > There are a couple of ways to solve this:
> > 1) automate it with a github actions/Jenkins
> > 2) manually add the header
> > 3) add a toggle to the gui/code that needs to be activated when you are
> > creating pipelines for the repository
> > 4) move to a binary format
> >
> > 1. Is not allowed/possible afaik, Jenkins definitely does not have write
> > access to the code base, Github might have permission to write to a pr.
> > But then the question arises if it is even allowed to add a header to a
> > file without the user confirming this.
> >
> > 2. This adds another boundary for non-developers/regular users to
> > contribute samples and integration tests, they don't care about the
> > content of a hpl/hwf in their eyes this is a "binary-file" that needs no
> > editing, and surely not every time you change a minor thing.
> > We are really trying hard to convince people to contribute small things
> > like a single sample or a single test, but noticed that even the usage
> > of github and how to create a PR can be a "hard" process that requires
> > hand-holding for our user base that consists mainly of non-developers.
> > This would raise the bar a bit higher making it harder for those willing
> > to jump.
> >
> > 3. This might work for the core developers/contributors but will
> > probably be forgotten by the friendly user that wants to contribute
> > once, meaning we would have to point to them to add the header or do it
> > ourselves.
> >
> > 4. No need for headers here we could even keep the current xml structure
> > but zip the content of the hpl/hwf
> >
> > So to summarize, in the short term getting a release out shouldn't be
> > hard. One of us can add the header to all the files and be done with it.
> > But in the long run this process is not sustainable.
> >
> > Cheers,
> > Hans
> >
> > On Sun, 13 Jun 2021 at 00:52, Julian Hyde <[email protected]> wrote:
> >
> >> I still don’t see why the discussion about Apache release policy needs
> >> to be connected with discussion about file formats. It’s simpler to
> >> resolve the issue about release policy first, make the release, and
> >> come back and discuss file format later.
> >>
> >> Regarding release policy. When a user contributes a test case to Hop,
> >> that is a creative work according to copyright law. Like any
> >> contribution, we don’t “claim copyright”; they retain copyright, but
> >> contribute under Apache license. And we require that text files have a
> >> header.
> >>
> >> No one is proposing adding headers to pipeline and workflow files that
> >> are not contributed to Hop.
> >>
> >> I find it hard to believe that adding a header to a test case will make
> >> it behave differently, in the vast majority of cases. Exceptions can be
> >> made for the few case where it matters.
> >>
> >> Julian
> >>
> >>> On Jun 12, 2021, at 3:25 PM, Matt Casters <[email protected]> wrote:
> >>>
> >>> That's really my point: it's really not as straightforward at all like
> >>> you claimed Julian. The files are produced by the Hop GUI and that's
> >>> what we want. We want to test what is actually used by our end-users,
> >>> not some theoretical use-case which is typically handled by
> >>> JUnit/Mockito/Powermock and their ilk. It's this old-school vision
> >>> that an XML file has to be written by hand or something like that
> >>> which messes up this debate.
> >>> The .hpl/.hwf file format does not and should not include the ASF
> >>> header either. For our users it would be inappropriate as we can't
> >>> claim copyright on works produced by others. In other words, when
> >>> some person or company uses our software and creates a pipeline, we
> >>> can't just claim copyright for that file. At least that's how I see
> >>> things.
> >>>
> >>> As for YAML: my dislike for it is enormous but since it wouldn't solve
> >>> the header issue I wouldn't pick it for that reason alone since it
> >>> allows comments. Perhaps we should serialize in some binary format to
> >>> get past this issue. Since we'll need to continue XML serialization
> >>> anyway it's just a question of storing the integration tests and
> >>> samples in a way that can be approved by the ASF.
> >>>
> >>> On Sat, Jun 12, 2021 at 10:54 PM Julian Hyde <[email protected]> wrote:
> >>>
> >>>> I don’t think the discussion about headers really forces this issue.
> >>>> It’s a technical decision and shouldn’t be rushed.
> >>>>
> >>>> Regarding the headers. It is straightforward to add headers to
> >>>> existing files. It is also straightforward to use a tool such as
> >>>> checkstyle to enforce them (so, any PR that adds a .hpl file without
> >>>> a header will get a build error, which the contributor will duly
> >>>> fix).
> >>>>
> >>>> In my opinion, Hop should allow multiple formats. XML is rather old,
> >>>> and people find it difficult to read without practice. JSON is a bit
> >>>> more modern, but has terrible support for multi-line strings and (in
> >>>> its official form) doesn’t allow comments and is strict about quoting
> >>>> of identifiers. YAML (or similar) is worth considering; its model is
> >>>> compatible with JSON, it allows comments, it has much better support
> >>>> for multi-line strings, and it tends to diff/merge easier than XML
> >>>> and JSON.
> >>>>
> >>>> Julian
> >>>>
> >>>>> On Jun 12, 2021, at 1:38 PM, Matt Casters <[email protected]> wrote:
> >>>>>
> >>>>> Folks,
> >>>>>
> >>>>> It's been up in the air for quite some time now but it looks like
> >>>>> we're being forced by certain discussions in the release voting of
> >>>>> 0.99-rc1. How would you feel about moving to JSON for the standard
> >>>>> file format of pipelines and workflows?
> >>>>> I propose .hpj and .hwj as extensions.
> >>>>> This would push back our releases for a month or so while we convert
> >>>>> the remaining serialization code to the new @HopMetadataProperty API
> >>>>>
> >>>>> Cheers,
> >>>>> Matt
> >>>
> >>> --
> >>> Neo4j Chief Solutions Architect
> >>> *✉ *[email protected]
> >>> ☎ +32486972937
