Hi,

This is great!

What do folks think about also having a less minimal set of starters? For Java I am thinking of protobuf / AutoValue. For Python, maybe an opinionated setup with tox etc. Again, these would just contain "hello world" samples to get folks going.

Regards,
Reza

On Wed, 9 Feb 2022 at 13:56, Robert Burke <r...@google.com> wrote:

> SGTM.
>
> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <k...@apache.org> wrote:
>
>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601 I think it will be simplest to license it under ASL2 and include a NOTICE file. The user will be free to "clone and go".
>>
>> I would bring these points back to the dev list:
>>
>> - ASL2 is what people expect from an ASF project, so it is "least surprise"
>> - Dual-licensing is possible (but I think not worthwhile due to its impact on contributor license agreements)
>> - ASL2 says "You must cause any modified files to carry prominent notices stating that You changed the files" which won't apply to the user's code and which I would guess they simply won't bother with for files in the template. Or maybe there is a clever way to phrase the header so it is already good to go.
>> - ASL2 says if the work includes a NOTICE file, you have to include the attributions from it. The NOTICE file is required by ASF policy. We can easily set it up to be a no-op for the user.
>>
>> So my overall take is that we should go ahead with ASL2 and a simple NOTICE file. Check the Jira for details.
>>
>> Kenn
>>
>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> And I've created the repos just now.
>>>
>>> Kenn
>>>
>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <k...@apache.org> wrote:
>>>
>>>> Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601
>>>>
>>>> Kenn
>>>>
>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <dannymccorm...@google.com> wrote:
>>>>
>>>>> Sure - I'm happy to help out with the Actions setup (and/or with the Go template). I will say though, the Actions config should be pretty darn simple for these examples - https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml seems right. For each language configuration we're targeting we basically just want a job with:
>>>>>
>>>>> - checkout
>>>>> - setup-<language>
>>>>> - inlined script to run tests
>>>>>
>>>>> Always happy to help with or consult on any Actions issues 🙂
>>>>>
>>>>> Thanks,
>>>>> Danny
>>>>>
>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <kerr...@google.com> wrote:
>>>>>
>>>>>> Danny has extensive experience with GitHub Actions, and may be able to help out.
>>>>>> Kerry
>>>>>>
>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <k...@apache.org> wrote:
>>>>>>
>>>>>>> I'm convinced on all points. My main motivation was to keep it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>
>>>>>>> I can take on the task of asking about MIT license and requesting the repos be created. Not sure if it needs my level of privileges but I'm happy to do it anyhow.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>
>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dcava...@google.com> wrote:
>>>>>>>> >
>>>>>>>> > MIT is much more permissive, but I also don't have any problems changing it to Apache license. In any case, how about we create the following repos?
>>>>>>>>
>>>>>>>> For these starter projects, we don't want to encumber any users of these templates with any particular licensing requirements (right?) and we don't even care about attribution. We want these to be pretty much as close to public domain as possible. That's not what the Apache licence does. (If it's even relevant, a good argument could likely be made for de minimis or fair use, but I think it's best to be explicit about this.) Perhaps this'd be a good question for Apache legal?
>>>>>>>>
>>>>>>>> > apache/beam-starter-java
>>>>>>>> > apache/beam-starter-python
>>>>>>>> > apache/beam-starter-go
>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>> > apache/beam-starter-scala
>>>>>>>> >
>>>>>>>> > We'll start by populating the Java one which is the most pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>> >
>>>>>>>> > +David Huntsperger, tl;dr: these are minimal starter projects for every language. Once we have Java, Python and Go, it might be a good idea to change the quickstarts to use these instead of the word count. There is already a dedicated word count walkthrough so I think that is already covered.
>>>>>>>> >
>>>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>>>> >
>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>> >>
>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <k...@apache.org> wrote:
>>>>>>>> >> >
>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part of it.
>>>>>>>> >> >
>>>>>>>> >> > But also the question "I simply don't know what one would put in a Python repo then, other than a bare setup.py that lists a dependency on apache_beam" is answered by David's initial email and his repo, namely:
>>>>>>>> >> >
>>>>>>>> >> > - GitHub Actions configuration
>>>>>>>> >> > - README.md
>>>>>>>> >> > - example that already runs
>>>>>>>> >>
>>>>>>>> >> OK, fair enough.
>>>>>>>> >>
>>>>>>>> >> > - LICENSE (notably you've got it as MIT but to be part of Apache software it needs to be ASL2)
>>>>>>>> >>
>>>>>>>> >> On the topic of licence, it's a bit tricky because one doesn't want to bind the users of such a template as being a derivative work of a too-restrictive licence. The licence of the template itself should generally be very permissive.
>>>>>>>> >>
>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>> >> >>
>>>>>>>> >> >> I think for consistency it makes sense to users to be told to check out this git repo for the language of your choice and run. Some repos will have more/less than others when it comes to setup necessary.
>>>>>>>> >> >>
>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>> >> >>>
>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project there is quite complicated. I simply don't know what one would put in a Python repo then, other than a bare setup.py that lists a dependency on apache_beam. We don't have recommendations on file layout, etc. more than that (though there's plenty of generic advice to be found out there on the topic).
>>>>>>>> >> >>> I have a hunch Go is similar, and JavaScript would be as well (npm install apache-beam and your package.json file gets updated).
>>>>>>>> >> >>>
>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > There are several examples already within the Beam repo found in:
>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>> >> >>> >
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <sachi...@google.com> wrote:
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just for novelty/freshness but agreed with the suggestion that having an example in each quickstart would be ideal.
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <dhuntsper...@google.com> wrote:
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> +1 to a separate repo for each language.
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example in each repo? I know that makes the repos less minimal, but we could rewrite the quickstarts around these repos instead of the current Wordcount examples. Or maybe we don't need to use the Wordcount example in the quickstarts...
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <dcava...@google.com> wrote:
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance is preferable, and the GitHub repos are more flexible and maintainable.
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer having repos for all languages. It makes sense for consistency as well.
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>> >> >>> >>>>>
>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out a specific version of the examples that coincides with a specific SDK version then we could drop the archetypes.
>>>>>>>> >> >>> >>>>>
>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bhule...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to break commonly, but I think it would be good to make sure tests aren't failing when a release is published.
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a breakage after the release. Agree we should verify RCs (document as part of the release process), or even better, add automation to verify the repo against snapshots. The automation could be nice to have anyway since it provides an example for users to follow if they want to test against snapshots and report issues to us sooner.
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the archetype?
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>
>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>> >> >>> >>>>>>>
>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dcava...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo since we can see what a true minimal project looks like. Having it in the main repo would inherit build file configurations and other settings that would be different from a clean project, so it could be non-trivial to adapt. Also as its own repo, it's easier to clone and modify, or create an instance of the template.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam version and other dependencies automatically. Testing is already set up via GitHub Actions for every pull request, so it would automatically be tested as soon as there is a new dependency version available.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to break commonly, but I think it would be good to make sure tests aren't failing when a release is published.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and having all the build systems we want to support for them, as long as we document which files are for which build system. That way there are fewer repos to maintain.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than the archetypes but the archetypes have a few conveniences since they are integrated with the apache/beam repo.
>>>>>>>> >> >>> >>>>>>>>> For example, updates/testing are done at the same time a corresponding change to the main repo is done (like library version updates), and they are released when the SDK is released.
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single starter repo containing all the starters or one per language or one per build system?
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g. release manager owns it)?
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dcava...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't work very well for Gradle and SBT users. I think a GitHub template might be the more flexible option, and we could have something similar for other languages as well. Having said that, we could still create a Maven archetype. If someone is familiar with that process, please let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin down the name of the repo, create it, and move the code there. I was thinking either `apache/beam-java-template` or `apache/beam-java-starter`. What do you think?
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on this? Do you need help?
>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bhule...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already, built with Maven Archetype [1]. It's what powers the Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put this somewhere like apache/beam-java-template? I think apache repositories like beam-* are allowed.
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>>>> >> >>> >>>>>>>>>>>> [2] https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dcava...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <dcava...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java project, I've been working on a GitHub template containing a minimal Beam Java pipeline for people to start with.
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template: https://github.com/davidcavazos/beam-java
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Minimal "Hello World" Beam pipeline
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Minimal test file
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Continuous integration via GitHub Actions (around 1-2 minutes to run)
>>>>>>>> >> >>> >>>>>>>>>>>>>> - README with instructions on how to build, run, test, and add other runners
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a template.
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Right now it lives in my personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>> >> >>> >>>>>>>>>>>>>> - Update/create docs with instructions on how to create a new Beam Java pipeline
>>>>>>>>
>>>>>>>