Re: Thoughts on a reference runner to invest in?

Ismaël Mejía Fri, 15 Feb 2019 06:46:38 -0800

Just a minor point: building a community around the Java-based ULR has
not happened in part because there has not been much effort to try
to do so. The focus has been on creating production-ready runners (which
makes sense), but the opportunity is still there.


Kenn is right: a more 'formalized' and 'readable' runner would be
amazing, but probably hard to get right without good gRPC support.

On Fri, Feb 15, 2019 at 6:05 AM Kenneth Knowles <k...@apache.org> wrote:
>
> Interesting point about community and the fact that it didn't build a 
> Java-based ULR even though it has been a possibility for a long time.
>
> It makes sense to me. A non-Java SDK needs portability to run on Beam's 
> distributed runners, so building the portable SDK harness is key, unlike for 
> Java. And to build it, a local portability-based runner is a great help 
> (can't really imagine doing it without one). And of course building it in 
> Python makes sense if you are steeped in Python.
>
> Joking-but-not-joking: the best reference runner would probably be in some 
> less popular but very readable functional language so it is different from 
> every SDK :-). I've looked into it and discovered that gRPC support is not 
> great...
>
> Kenn
>
> On Thu, Feb 14, 2019 at 5:47 AM Robert Bradshaw <rober...@google.com> wrote:
>>
>> I think it's good to distinguish between direct runners (which would
>> be good to have in every language, and can grow in sophistication with
>> the userbase) and a fully universal reference runner. We should of
>> course continue to grow and maintain the java-runners-core shared
>> library, possibly driven by the various production runners, which
>> has been the most productive approach to date. (The point about community is a
>> good one. Unfortunately over the past 1.5 years the bigger Java
>> community has not resulted in a more complete Java ULR (in terms of
>> number of contributors or features/maturity), and it's unclear what
>> would change that in the future.)
>>
>> It would be really great to have (at least) two completely separate
>> implementations, but (at the moment at least) I see that as lower
>> value than accelerating the efforts to get existing production runners
>> onto portability.
>>
>> On Thu, Feb 14, 2019 at 2:01 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>> >
>> > This is a really interesting and important discussion. Having multiple
>> > reference runners can have its pros and cons. It is all about
>> > tradeoffs. From the end user point of view it can feel weird to deal
>> > with tools and packaging of a different ecosystem, e.g. Python devs
>> > dealing with all the quirkiness of Java packaging, or vice versa,
>> > Java developers dealing with pip and friends. So having a reference
>> > runner per language would be more natural and would also help validate the
>> > portability concept. However, having multiple reference runners sounds
>> > harder from the maintenance point of view.
>> >
>> > Most of the software in the domain of Beam has traditionally been
>> > written in Java, so there is a BIG advantage in ready-to-use (and
>> > mature) libraries and reusable components (the reference runner
>> > may also profit from the libraries that Thomas and others in the community
>> > have developed for multiple runners). This is a big win, but more
>> > importantly, we can have more eyes looking and contributing improvements
>> > and fixes that will benefit the reference runner and others.
>> >
>> > Having a reference runner per language would be nice but if we must
>> > choose only one language I prefer it to be Java just because we have a
>> > bigger community that can contribute and improve it. We may work on
>> > making the distribution of such a runner easier or friendlier for
>> > users of different languages.
>> >
>> > On Wed, Feb 13, 2019 at 3:47 AM Robert Bradshaw <rober...@google.com> 
>> > wrote:
>> > >
>> > > I agree, it's useful for runners that are used for tests (including 
>> > > testing SDKs) to push into the dark corners of what's allowed by the 
>> > > spec. I think this can be added (where they don't already exist) to 
>> > > existing non-production runners. (Whether a direct runner should be 
>> > > considered production or not depends on who you ask...)
>> > >
>> > > On Wed, Feb 13, 2019 at 2:49 AM Daniel Oliveira <danolive...@google.com> 
>> > > wrote:
>> > >>
>> > >> +1 to Kenn's point. Regardless of whether we go with a Python runner or 
>> > >> a Java runner, I think we should have at least one portable runner that 
>> > >> isn't a production runner for the reasons he outlined.
>> > >>
>> > >> As for the rest of the discussion, it sounds like people are generally 
>> > >> supportive of having the Python FnApiRunner as that runner, and using 
>> > >> Flink as a reference implementation for portability in Java.
>> > >>
>> > >> On Tue, Feb 12, 2019 at 4:37 PM Kenneth Knowles <k...@apache.org> wrote:
>> > >>>
>> > >>>
>> > >>> On Tue, Feb 12, 2019 at 8:59 AM Thomas Weise <t...@apache.org> wrote:
>> > >>>>
>> > >>>> The Java ULR initially provided some value for the portability effort 
>> > >>>> as Max mentions. It helped to develop the shared library for all Java 
>> > >>>> runners and the job server functionality.
>> > >>>>
>> > >>>> However, I think the same could have been accomplished by developing 
>> > >>>> the Flink runner instead of the Java ULR from the get-go. This is 
>> > >>>> also what happened later last year when support for state, timers and 
>> > >>>> metrics was added to the portable Flink runner first and the ULR 
>> > >>>> still does not support those features [1].
>> > >>>>
>> > >>>> Since all (or most) Java based runners that are based on another ASF 
>> > >>>> project support embedded execution, I think it might make sense to 
>> > >>>> discontinue separate direct runners for Java and instead focus 
>> > >>>> efforts on making the runners that folks would also use in production 
>> > >>>> better?
>> > >>>
>> > >>>
>> > >>> Caveat: if people only test using embedded execution of a production 
>> > >>> runner, they are quite likely to depend on quirks of that runner, such 
>> > >>> as bundle size, fusion, whether shuffle is also checkpoint, etc. I 
>> > >>> think there's a lot of value in an antagonistic testing runner, which 
>> > >>> is something the Java DirectRunner tried to do with GBK random 
>> > >>> ordering, checking illegal mutations, checking encodability. These 
>> > >>> were all driven by real user needs and each caught a lot of user bugs. 
>> > >>> That said, I wouldn't want to maintain an extra runner, but would like 
>> > >>> to put these into a portable runner, whichever it is.
>> > >>>
>> > >>> Kenn
>> > >>>
>> > >>>>
>> > >>>>
>> > >>>> As for Python (and hopefully soon Go), it makes a lot of sense to 
>> > >>>> have a simple to use and stable runner that can be used for local 
>> > >>>> development. At the moment, the Py FnApiRunner seems the best 
>> > >>>> candidate to serve as reference for portability.
>> > >>>>
>> > >>>> On a related note, we should probably also consider making pure Java 
>> > >>>> pipeline execution via portability framework on a Java runner simpler 
>> > >>>> and more efficient. We already use embedded environment for testing. 
>> > >>>> If we also inline/embed the job server and this becomes readily 
>> > >>>> available and easy to use, it might improve chances of other runners 
>> > >>>> migrating to portability sooner.
>> > >>>>
>> > >>>> Thomas
>> > >>>>
>> > >>>> [1] https://s.apache.org/apache-beam-portability-support-table
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On Tue, Feb 12, 2019 at 3:34 AM Maximilian Michels <m...@apache.org> 
>> > >>>> wrote:
>> > >>>>>
>> > >>>>> Do you consider job submission and artifact staging part of the
>> > >>>>> ReferenceRunner? If so, these parts have been reused or served as a
>> > >>>>> model for the portable FlinkRunner. So they had some value.
>> > >>>>>
>> > >>>>> A reference implementation helps Runner authors to understand and reuse
>> > >>>>> the code. However, I agree that the Flink implementation is more helpful
>> > >>>>> to Runner authors than a ReferenceRunner which was designed for
>> > >>>>> single-node testing.
>> > >>>>>
>> > >>>>> I think there are three parts which help to push forward portability:
>> > >>>>>
>> > >>>>> 1) Good library support for new portable Runners (Java)
>> > >>>>> 2) A reference implementation of a distributed Runner (Flink)
>> > >>>>> 3) An easy way for users to run/test portable Pipelines (Python via
>> > >>>>> FnApiRunner)
>> > >>>>>
>> > >>>>> The main motivation for the portability layer is supporting languages
>> > >>>>> in addition to Java. Most users will be using Python, so focusing on a
>> > >>>>> good reference Runner in Python is key.
>> > >>>>>
>> > >>>>> -Max
>> > >>>>>
>> > >>>>> On 12.02.19 10:11, Robert Bradshaw wrote:
>> > >>>>> > This is certainly an interesting question, and I definitely have my
>> > >>>>> > opinions, but am curious as to what others think as well.
>> > >>>>> >
>> > >>>>> > One thing that I think wasn't as clear from the outset is 
>> > >>>>> > distinguishing
>> > >>>>> > between the development of runners/core-java and development of a 
>> > >>>>> > Java
>> > >>>>> > reference runner itself. With the work on moving Flink to
>> > >>>>> > portability, it turned out that work on the latter was not a
>> > >>>>> > prerequisite for work on the former, and runners/core-java is the
>> > >>>>> > artifact that other runners want to build on. I think that it is 
>> > >>>>> > also
>> > >>>>> > the case, as suggested, that a distributed runner's use of this 
>> > >>>>> > shared
>> > >>>>> > library is a better reference point (for other distributed 
>> > >>>>> > runners) than
>> > >>>>> > one using the direct runner (e.g. there is a much more obvious
>> > >>>>> > delineation between the runner's responsibility and Beam code than 
>> > >>>>> > in
>> > >>>>> > the direct runner where the boundaries between orchestration, 
>> > >>>>> > execution,
>> > >>>>> > and other concerns are not as clear).
>> > >>>>> >
>> > >>>>> > As well as serving as a reference to runner implementers, the 
>> > >>>>> > reference
>> > >>>>> > runner can also be useful for prototyping (here I think Python 
>> > >>>>> > holds an
>> > >>>>> > advantage, but we're getting into subjective areas now), 
>> > >>>>> > documenting (or
>> > >>>>> > ideally augmenting the documentation of) the spec (here I'd say a
>> > >>>>> > smaller advantage to Python, but neither runner is clean, 
>> > >>>>> > straightforward,
>> > >>>>> > and documented enough to serve this purpose well yet), and serving 
>> > >>>>> > as a
>> > >>>>> > lightweight universal local runner against which to develop (and,
>> > >>>>> > possibly use long term in place of a direct runner) new SDKs (here
>> > >>>>> > you'll get a wide variety of answers whether Python or Java is 
>> > >>>>> > easier to
>> > >>>>> > take on as a dependency for a third language, or we could just 
>> > >>>>> > package
>> > >>>>> > it up in a docker image and take docker as a dependency).
>> > >>>>> >
>> > >>>>> > Another, more pragmatic note: one thing that helped move both the Flink
>> > >>>>> > runner and the FnApiRunner forward is that they were driven by actual
>> > >>>>> > use cases. Lyft has actual Python (necessitating portable) pipelines
>> > >>>>> > they want to run on Flink, and the FnApiRunner is the direct runner for
>> > >>>>> > Python. The Java ULR (at least where it is now) sits in an awkward 
>> > >>>>> > place
>> > >>>>> > where its only role is to be a reference rather than be used, 
>> > >>>>> > which (in
>> > >>>>> > a world of limited resources) makes it harder to justify 
>> > >>>>> > investment.
>> > >>>>> >
>> > >>>>> > - Robert
>> > >>>>> >
>> > >>>>> >
>> > >>>>> >
>> > >>>>> > On Tue, Feb 12, 2019 at 3:53 AM Kenneth Knowles <k...@apache.org> wrote:
>> > >>>>> >
>> > >>>>> >     Interesting silence here. You've got it right that the reason 
>> > >>>>> > we
>> > >>>>> >     initially chose Java was because of the cross-runner sharing. 
>> > >>>>> > The
>> > >>>>> >     reference runner could be the first target runner for any new
>> > >>>>> >     feature and then its work could directly (or indirectly via
>> > >>>>> >     copy/paste/modify if it works better) be used in other runners.
>> > >>>>> >     Examples:
>> > >>>>> >
>> > >>>>> >       - The implementations of (pre-portability) state & timers in
>> > >>>>> >     runners/core-java and prototyped in the Java DirectRunner made 
>> > >>>>> > it a
>> > >>>>> >     matter of a couple of days to implement on other runners, and 
>> > >>>>> > they
>> > >>>>> >     saw pretty quick adoption.
>> > >>>>> >       - Probably the same could be said for the first drafts of the
>> > >>>>> >     runners, which re-used a bunch of runners/core-java and had 
>> > >>>>> > each
>> > >>>>> >     others' translation code as a reference.
>> > >>>>> >
>> > >>>>> >     I'm interested if anyone would be willing to confirm if it is
>> > >>>>> >     because the FlinkRunner has forged ahead and the Dataflow 
>> > >>>>> > worker is
>> > >>>>> >     open source. It makes sense that the code from a distributed 
>> > >>>>> > runner
>> > >>>>> >     is an even better reference point if you are building another
>> > >>>>> >     distributed runner. From the look of it, the SamzaRunner had no
>> > >>>>> >     trouble getting started on portability.
>> > >>>>> >
>> > >>>>> >     Kenn
>> > >>>>> >
>> > >>>>> >     On Mon, Feb 11, 2019 at 6:04 PM Daniel Oliveira <danolive...@google.com> wrote:
>> > >>>>> >
>> > >>>>> >         Yeah, the FnApiRunner is what I'm leaning towards too. I 
>> > >>>>> > wasn't
>> > >>>>> >         sure how much demand there was for an actual reference
>> > >>>>> >         implementation in Java though, so I was hoping there were 
>> > >>>>> > runner
>> > >>>>> >         authors that would want to chime in.
>> > >>>>> >
>> > >>>>> >         On the other hand, the Flink runner could serve as a 
>> > >>>>> > reference
>> > >>>>> >         implementation for portable features since it's further 
>> > >>>>> > along,
>> > >>>>> >         so maybe it's not an issue regardless.
>> > >>>>> >
>> > >>>>> >         On Mon, Feb 11, 2019 at 1:09 PM Sam Rohde <sro...@google.com> wrote:
>> > >>>>> >
>> > >>>>> >             Thanks for starting this thread. If I had to guess, I would
>> > >>>>> >             say there is more demand for Python, as it's more widely
>> > >>>>> >             used by data scientists and for analytics. Being pragmatic,
>> > >>>>> >             the FnApiRunner already has more feature work than the Java
>> > >>>>> >             reference runner, so we should go with that.
>> > >>>>> >
>> > >>>>> >             -Sam
>> > >>>>> >
>> > >>>>> >             On Fri, Feb 8, 2019 at 10:07 AM Daniel Oliveira <danolive...@google.com> wrote:
>> > >>>>> >
>> > >>>>> >                 Hello Beam dev community,
>> > >>>>> >
>> > >>>>> >                 For those who don't know me, I work for Google and I've
>> > >>>>> >                 been working on the Java reference runner, which is a
>> > >>>>> >                 portable, local Java runner (it's basically the direct
>> > >>>>> >                 runner with the portability APIs implemented). Our goal
>> > >>>>> >                 in working on this was to have a portable runner that
>> > >>>>> >                 ran locally so that users could test portable pipelines
>> > >>>>> >                 with it, devs could test new portability features, and
>> > >>>>> >                 runner authors would have a simple reference
>> > >>>>> >                 implementation of a portable runner.
>> > >>>>> >
>> > >>>>> >                 Due to various circumstances though, progress on the
>> > >>>>> >                 Java reference runner has been pretty slow, and a Python
>> > >>>>> >                 runner which does pretty much the same things was made
>> > >>>>> >                 to aid portability development in Python (called the
>> > >>>>> >                 FnApiRunner). This runner is currently further along in
>> > >>>>> >                 feature work than the Java reference runner, so we've
>> > >>>>> >                 been reevaluating if we should switch to investing in it
>> > >>>>> >                 instead.
>> > >>>>> >
>> > >>>>> >                 My question to the community is: Which runner do you
>> > >>>>> >                 think would be more valuable to the dev community and
>> > >>>>> >                 Beam users? For those of you who are runner authors, do
>> > >>>>> >                 you have a preference for what language you'd like to
>> > >>>>> >                 see a reference implementation in?
>> > >>>>> >
>> > >>>>> >                 Thanks,
>> > >>>>> >                 Daniel Oliveira
>> > >>>>> >
