+1

On Tue, Mar 10, 2020 at 12:59 AM Alex Van Boxel <[email protected]> wrote:

> One last thing, for any runner after this one... wouldn't it be a good
> acceptance criterion to only accept portable implementations from now on?
>
>  _/
> _/ Alex Van Boxel
>
>
> On Mon, Mar 9, 2020 at 10:42 PM Ismaël Mejía <[email protected]> wrote:
>
>> Good points Kenn. I think we mostly agree on what has been discussed in
>> this thread, the pros/cons of having runners in our repository, but this is
>> probably not the best moment in time to change any policy in that respect.
>>
>> So if nobody objects I think we can proceed. I am OOO this week, so I have
>> less time to continue with the code review, but I will be back to finish the
>> review and hopefully finally get this merged with Pulasthi next week (sorry
>> for the delay).
>>
>> > (don't wait for me on code review - if Ismaël said it is good, then it
>> > is good.)
>>
>> Thanks for your confidence. The Twister2 runner looks good so far, but I will
>> confirm 100% next week :) In the meantime, if someone has some extra cycles
>> to take a look, extra feedback is always welcome.
>>
>> On Mon, Mar 9, 2020 at 5:50 AM Kenneth Knowles <[email protected]> wrote:
>> >
>> > I haven't heard anyone suggest that we need a vote. I haven't heard
>> anyone object to this being merged to master. Some time ago, we mostly
>> decided to favor master instead of branches, because it is so much smoother
>> for contributors and users.
>> >
>> > So I am poking this thread one last time and otherwise I would consider
>> it consensus that once code review is done the runner is a part of Beam
>> (experimental!).
>> >
>> > (don't wait for me on code review - if Ismaël said it is good, then it
>> is good.)
>> >
>> > Kenn
>> >
>> > On Fri, Mar 6, 2020 at 7:47 AM Pulasthi Supun Wickramasinghe <
>> [email protected]> wrote:
>> >>
>> >> I understand that the discussion is at a broader level than the
>> Twister2 runner. From my experience developing the runner the main
>> advantage of being inside the beam project was the easy access to the wide
>> range of tests and other core/utility code as Kyle pointed out. Unmerging
>> runners that are not properly maintained and updated would be the most
>> logical path to follow since the internals of the runners are only well
>> understood by developers of that particular project. It would be
>> unreasonable to expect the Beam community to maintain them. And since the
>> runners do not alter the core APIs, I assume they would be easy to unmerge
>> if the need arises.
>> >>
>> >> Talking specifically about Twister2 runner, we hope to continue
>> developing the runner in the future to add both streaming capability and
>> develop a portable runner as well. The team behind Twister2 is working
>> towards the goal to get the project into Apache Incubator in the near
>> future (Hopefully to submit the proposal in the next couple of months).
>> >>
>> >> Best Regards,
>> >> Pulasthi
>> >>
>> >>
>> >>
>> >> On Thu, Mar 5, 2020 at 6:56 PM Robert Bradshaw <[email protected]>
>> wrote:
>> >>>
>> >>> I think we will get to a point where it makes sense for runners to
>> >>> live in their own repositories, with their own release cadence, but
>> >>> we're not at that point yet. One prerequisite is a stable API--we're
>> >>> closing in on that with the portability protos, but many (java)
>> >>> runners actually share the common runner core libraries and that is
>> >>> even less set in stone.
>> >>>
>> >>> On the other hand, taking responsibility for maintaining all runners
>> >>> is not a tenable or scalable position for the Beam project. If a
>> >>> runner is merged, it should be understood that it can be "un-merged"
>> >>> if it causes a maintenance burden. A completely separate
>> >>> project/repository makes this less messy.
>> >>>
>> >>> On Thu, Mar 5, 2020 at 10:01 AM Kenneth Knowles <[email protected]>
>> wrote:
>> >>> >
>> >>> > I agree with both of you, mostly :-)
>> >>> >
>> >>> > The monorepo approach doesn't work/scale well for shipped libraries
>> (name a Google library that silently just works and never causes any
>> dependency problems) and the pain we feel has been constant and increasing,
>> but I don't think we are at the breaking point.
>> >>> >
>> >>> > But Google's big monorepo [1] demonstrates similar benefits to what
>> Kyle describes. In the early stages the benefit of not having to think too
>> hard about build/test infra and share it everywhere is a big help, and it
>> scales well. Eventually, shipping test utility libraries and compliance
>> suites can be equivalent. And to your point - it is very helpful for users
>> to know that they can use CassandraIO with the other Beam artifacts. This
>> is why Google requires the whole big repo to depend on a single version of
>> any externally-controlled artifact. But, yes, as a consequence it is
>> preposterously difficult to stay up to date, since literally anything can
>> block progress. You need a unified escalation chain for that policy to make
>> sense. It is the definition of a healthy Apache project to *not* have that
>> (PMC is different).
>> >>> >
>> >>> > Independent dependencies, independent git histories, and
>> independent release cadence/process are all separate discussions.
>> >>> >
>> >>> > It is a broader question than this particular contribution, so
>> let's merge this runner before changing our whole way of doing things :-)
>> >>> >
>> >>> > Kenn
>> >>> >
>> >>> > [1]
>> https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext
>> (really quite a balanced analysis)
>> >>> >
>> >>> > On Wed, Mar 4, 2020 at 11:51 AM Kyle Weaver <[email protected]>
>> wrote:
>> >>> >>
>> >>> >> > Should runners, current and future, be in the same repository as
>> Beam
>> >>> >> > core?
>> >>> >>
>> >>> >> In the distant past, runners lived in their own repositories, and
>> then were donated to Beam. But Beam's current uber-repo setup allows a lot
>> of convenience. For example, a ton of code (including core functionality
>> and tests) is shared directly between runners, which is useful for keeping
>> runners up to date and ensuring consistent behavior between them (in other
>> words, maintainable and reliable).
>> >>> >>
>> >>> >> Generally, it is up to the authors of a particular Beam related
>> project/subproject to decide whether to host their code in Beam or in a
>> different repo, and up to the community to decide whether to take on the
>> donation, as discussed in previous threads on the Twister2 runner. In this
>> case, it seems there is agreement between the Twister2 runner authors and
>> the community that the runner can be hosted in Beam proper.
>> >>> >>
>> >>> >> There are examples of successful independent Beam projects, such
>> as Spotify's Scio, but having an independent project with its own releases
>> requires a lot of dedicated resources, and the bar for entry for extending
>> Beam should not be that high. All that's required of subproject authors is
>> that they keep the subproject in step with Beam. If they can't maintain it
>> any longer, the subproject can be allowed to bitrot without getting in
>> anyone's way. On the other hand, I'm not sure of the details with
>> Cassandra, but in general, a subproject should not have "the ability to
>> block progress" just because it is contained in the Beam uber-repo.
>> >>> >>
>> >>> >> tl;dr Having an uber repo generally seems to work for Beam.
>> Exceptions are few enough to be handled on a case-by-case basis.
>> >>> >>
>> >>> >> On Wed, Mar 4, 2020 at 11:12 AM Elliotte Rusty Harold <
>> [email protected]> wrote:
>> >>> >>>
>> >>> >>> Generic question without commenting on Twister2 specifically:
>> >>> >>>
>> >>> >>> Should runners, current and future, be in the same repository as
>> Beam
>> >>> >>> core? Can or should they be completely separate products with
>> their
>> >>> >>> own release cycles?
>> >>> >>>
>> >>> >>> Generally, loose coupling leads to more maintainable, reliable
>> >>> >>> projects. Specifically, Cassandra is holding back some other
>> changes
>> >>> >>> in Beam and I really wish it didn't have the ability to block
>> >>> >>> progress. The more different runners we have in core, the worse
>> this
>> >>> >>> problem is likely to become.
>> >>> >>>
>> >>> >>>
>> >>> >>> On Wed, Mar 4, 2020 at 2:03 PM Pulasthi Supun Wickramasinghe
>> >>> >>> <[email protected]> wrote:
>> >>> >>> >
>> >>> >>> > Hi
>> >>> >>> >
>> >>> >>> > I believe the pull request is pretty complete now with the help
>> of Ismaël. Kenn, would you be able to take a look at it and suggest any
>> changes if needed? The build checks and validation tests are passing at
>> the moment.  I will start working on the documentation that you mentioned
>> in an earlier email separately.
>> >>> >>> >
>> >>> >>> > Best Regards,
>> >>> >>> > Pulasthi
>> >>> >>> >
>> >>> >>> >
>> >>> >>> >
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > On Tue, Feb 18, 2020 at 1:45 PM Pulasthi Supun Wickramasinghe <
>> [email protected]> wrote:
>> >>> >>> >>
>> >>> >>> >> Hi All,
>> >>> >>> >>
>> >>> >>> >> I have created the initial pull request [1] to contribute the
>> Twister2 Beam runner to the Apache Beam codebase. More information on
>> Twister2 can be found here[2] and the Twister2 codebase is available
>> here[3]. At the moment only batch mode is supported in the runner, but we
>> are planning to add stream support and implement a portable runner for
>> Twister2 in the near future.
>> >>> >>> >>
>> >>> >>> >> As Kenn pointed out in an earlier email it would be great to
>> have inputs from the community regarding this contribution since it is a
>> sizable one. I am sure there are many improvements that can be made in the
>> contributed codebase with input from the community.
>> >>> >>> >>
>> >>> >>> >> [1] https://github.com/apache/beam/pull/10888
>> >>> >>> >> [2] https://twister2.org/
>> >>> >>> >> [3] https://github.com/DSC-SPIDAL/twister2
>> >>> >>> >>
>> >>> >>> >> Best Regards,
>> >>> >>> >> Pulasthi
>> >>> >>> >> --
>> >>> >>> >> Pulasthi S. Wickramasinghe
>> >>> >>> >> PhD Candidate  | Research Assistant
>> >>> >>> >> School of Informatics and Computing | Digital Science Center
>> >>> >>> >> Indiana University, Bloomington
>> >>> >>> >> cell: 224-386-9035
>> >>> >>> >
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > --
>> >>> >>> > Pulasthi S. Wickramasinghe
>> >>> >>> > PhD Candidate  | Research Assistant
>> >>> >>> > School of Informatics and Computing | Digital Science Center
>> >>> >>> > Indiana University, Bloomington
>> >>> >>> > cell: 224-386-9035
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> --
>> >>> >>> Elliotte Rusty Harold
>> >>> >>> [email protected]
>> >>
>> >>
>> >>
>> >> --
>> >> Pulasthi S. Wickramasinghe
>> >> PhD Candidate  | Research Assistant
>> >> School of Informatics and Computing | Digital Science Center
>> >> Indiana University, Bloomington
>> >> cell: 224-386-9035
>>
>
