Re: Thoughts on a reference runner to invest in?

Robert Bradshaw Wed, 13 Feb 2019 00:47:49 -0800

I agree, it's useful for runners that are used for tests (including testing
SDKs) to push into the dark corners of what's allowed by the spec. I think
these checks can be added (where they don't already exist) to existing
non-production runners. (Whether a direct runner should be considered
production or not depends on who you ask...)
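
As a rough illustration of the kind of antagonistic checks mentioned in this
thread (random ordering of grouped values, catching illegal mutations, and
checking encodability), here is a minimal Python sketch. It is not code from
the Java DirectRunner or from any Beam runner: every function name below is
hypothetical, and pickle is only an assumed stand-in for a real coder.

```python
import copy
import pickle
import random


def antagonistic_group_by_key(pairs, seed=None):
    """Group (key, value) pairs, then shuffle the values of each key.

    Hypothetical helper: it imitates a runner that makes no promise about
    the order of grouped values, so code relying on that order fails early.
    """
    rng = random.Random(seed)
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    for values in grouped.values():
        rng.shuffle(values)
    return grouped


def check_encodable(element):
    """Round-trip an element through pickle (a stand-in for a real coder)."""
    decoded = pickle.loads(pickle.dumps(element))
    if decoded != element:
        raise ValueError("element changed after encode/decode: %r" % (element,))
    return decoded


def call_checking_mutation(fn, element):
    """Call fn(element) and fail if fn modified its input in place."""
    snapshot = copy.deepcopy(element)
    result = fn(element)
    if element != snapshot:
        raise ValueError("function mutated its input: %r" % (snapshot,))
    return result
```

A test harness could wrap each user function in call_checking_mutation and
pass each element through check_encodable between steps, which is the spirit
of the checks described in the thread, not a description of how any existing
runner implements them.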


On Wed, Feb 13, 2019 at 2:49 AM Daniel Oliveira <danolive...@google.com>
wrote:

> +1 to Kenn's point. Regardless of whether we go with a Python runner or a
> Java runner, I think we should have at least one portable runner that isn't
> a production runner for the reasons he outlined.
>
> As for the rest of the discussion, it sounds like people are generally
> supportive of having the Python FnApiRunner as that runner, and using Flink
> as a reference implementation for portability in Java.
>
> On Tue, Feb 12, 2019 at 4:37 PM Kenneth Knowles <k...@apache.org> wrote:
>
>>
>> On Tue, Feb 12, 2019 at 8:59 AM Thomas Weise <t...@apache.org> wrote:
>>
>>> The Java ULR initially provided some value for the portability effort as
>>> Max mentions. It helped to develop the shared library for all Java runners
>>> and the job server functionality.
>>>
>>> However, I think the same could have been accomplished by developing the
>>> Flink runner instead of the Java ULR from the get go. This is also what
>>> happened later last year when support for state, timers and metrics was
>>> added to the portable Flink runner first and the ULR still does not support
>>> those features [1].
>>>
>>> Since all (or most) Java based runners that are based on another ASF
>>> project support embedded execution, I think it might make sense to
>>> discontinue separate direct runners for Java and instead focus efforts on
>>> making the runners that folks would also use in production better?
>>>
>>
>> Caveat: if people only test using embedded execution of a production
>> runner, they are quite likely to depend on quirks of that runner, such as
>> bundle size, fusion, whether shuffle is also a checkpoint, etc. I think
>> there's a lot of value in an antagonistic testing runner, which is
>> something the Java DirectRunner tried to do with GBK random ordering,
>> checking for illegal mutations, and checking encodability. These were all
>> driven by real user needs and each caught a lot of user bugs. That said, I
>> wouldn't want to maintain an extra runner, but would like to put these
>> checks into a portable runner, whichever it is.
>>
>> Kenn
>>
>>
>>>
>>> As for Python (and hopefully soon Go), it makes a lot of sense to have a
>>> simple-to-use and stable runner that can be used for local development. At
>>> the moment, the Py FnApiRunner seems the best candidate to serve as a
>>> reference for portability.
>>>
>>> On a related note, we should probably also consider making pure Java
>>> pipeline execution via the portability framework on a Java runner simpler
>>> and more efficient. We already use an embedded environment for testing. If
>>> we also inline/embed the job server and this becomes readily available and
>>> easy to use, it might improve the chances of other runners migrating to
>>> portability sooner.
>>>
>>> Thomas
>>>
>>> [1] https://s.apache.org/apache-beam-portability-support-table
>>>
>>>
>>>
>>> On Tue, Feb 12, 2019 at 3:34 AM Maximilian Michels <m...@apache.org>
>>> wrote:
>>>
>>>> Do you consider job submission and artifact staging part of the
>>>> ReferenceRunner? If so, these parts have been reused or served as a
>>>> model for the portable FlinkRunner. So they had some value.
>>>>
>>>> A reference implementation helps Runner authors to understand and reuse
>>>> the code. However, I agree that the Flink implementation is more helpful
>>>> to Runner authors than a ReferenceRunner which was designed for single
>>>> node testing.
>>>>
>>>> I think there are three parts which help to push forward portability:
>>>>
>>>> 1) Good library support for new portable Runners (Java)
>>>> 2) A reference implementation of a distributed Runner (Flink)
>>>> 3) An easy way for users to run/test portable Pipelines (Python via
>>>> FnApiRunner)
>>>>
>>>> The main motivation for the portability layer is supporting languages
>>>> in addition to Java. Most users will be using Python, so focusing on a
>>>> good reference Runner in Python is key.
>>>>
>>>> -Max
>>>>
>>>> On 12.02.19 10:11, Robert Bradshaw wrote:
>>>> > This is certainly an interesting question, and I definitely have my
>>>> > opinions, but am curious as to what others think as well.
>>>> >
>>>> > One thing that I think wasn't as clear from the outset is
>>>> > distinguishing between the development of runners/core-java and
>>>> > development of a Java reference runner itself. With the work on moving
>>>> > Flink to portability, it turned out that work on the latter was not a
>>>> > prerequisite for work on the former, and runners/core-java is the
>>>> > artifact that other runners want to build on. I think that it is also
>>>> > the case, as suggested, that a distributed runner's use of this shared
>>>> > library is a better reference point (for other distributed runners)
>>>> > than one using the direct runner (e.g. there is a much more obvious
>>>> > delineation between the runner's responsibility and Beam code than in
>>>> > the direct runner where the boundaries between orchestration,
>>>> > execution, and other concerns are not as clear).
>>>> >
>>>> > As well as serving as a reference to runner implementers, the
>>>> > reference runner can also be useful for prototyping (here I think
>>>> > Python holds an advantage, but we're getting into subjective areas
>>>> > now), documenting (or ideally augmenting the documentation of) the
>>>> > spec (here I'd say a smaller advantage to Python, but neither runner
>>>> > is clean, straightforward, and documented enough to serve this purpose
>>>> > well yet), and serving as a lightweight universal local runner against
>>>> > which to develop (and possibly use long term in place of a direct
>>>> > runner) new SDKs (here you'll get a wide variety of answers whether
>>>> > Python or Java is easier to take on as a dependency for a third
>>>> > language, or we could just package it up in a docker image and take
>>>> > docker as a dependency).
>>>> >
>>>> > Another more pragmatic note is that one thing that helped move both
>>>> > the Flink runner and the FnApiRunner forward is that they were driven
>>>> > by actual use cases: Lyft has actual Python (necessitating portable)
>>>> > pipelines they want to run on Flink, and the FnApiRunner is the direct
>>>> > runner for Python. The Java ULR (at least where it is now) sits in an
>>>> > awkward place where its only role is to be a reference rather than be
>>>> > used, which (in a world of limited resources) makes it harder to
>>>> > justify investment.
>>>> >
>>>> > - Robert
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Feb 12, 2019 at 3:53 AM Kenneth Knowles <k...@apache.org>
>>>> > wrote:
>>>> >
>>>> >     Interesting silence here. You've got it right that the reason we
>>>> >     initially chose Java was because of the cross-runner sharing. The
>>>> >     reference runner could be the first target runner for any new
>>>> >     feature and then its work could be used directly (or indirectly
>>>> >     via copy/paste/modify if it works better) in other runners.
>>>> >     Examples:
>>>> >
>>>> >       - The implementations of (pre-portability) state & timers in
>>>> >     runners/core-java, prototyped in the Java DirectRunner, made it a
>>>> >     matter of a couple of days to implement on other runners, and they
>>>> >     saw pretty quick adoption.
>>>> >       - Probably the same could be said for the first drafts of the
>>>> >     runners, which re-used a bunch of runners/core-java and had each
>>>> >     others' translation code as a reference.
>>>> >
>>>> >     I'm interested if anyone would be willing to confirm whether it is
>>>> >     because the FlinkRunner has forged ahead and the Dataflow worker
>>>> >     is open source. It makes sense that the code from a distributed
>>>> >     runner is an even better reference point if you are building
>>>> >     another distributed runner. From the look of it, the SamzaRunner
>>>> >     had no trouble getting started on portability.
>>>> >
>>>> >     Kenn
>>>> >
>>>> >     On Mon, Feb 11, 2019 at 6:04 PM Daniel Oliveira
>>>> >     <danolive...@google.com> wrote:
>>>> >
>>>> >         Yeah, the FnApiRunner is what I'm leaning towards too. I wasn't
>>>> >         sure how much demand there was for an actual reference
>>>> >         implementation in Java though, so I was hoping there were
>>>> >         runner authors that would want to chime in.
>>>> >
>>>> >         On the other hand, the Flink runner could serve as a reference
>>>> >         implementation for portable features since it's further along,
>>>> >         so maybe it's not an issue regardless.
>>>> >
>>>> >         On Mon, Feb 11, 2019 at 1:09 PM Sam Rohde <sro...@google.com>
>>>> >         wrote:
>>>> >
>>>> >             Thanks for starting this thread. If I had to guess, I would
>>>> >             say there is more of a demand for Python as it's more
>>>> >             widely used by data scientists and for analytics. Being
>>>> >             pragmatic, the FnApiRunner already has more feature work
>>>> >             than the Java reference runner so we should go with that.
>>>> >
>>>> >             -Sam
>>>> >
>>>> >             On Fri, Feb 8, 2019 at 10:07 AM Daniel Oliveira
>>>> >             <danolive...@google.com> wrote:
>>>> >
>>>> >                 Hello Beam dev community,
>>>> >
>>>> >                 For those who don't know me, I work for Google and I've
>>>> >                 been working on the Java reference runner, which is a
>>>> >                 portable, local Java runner (it's basically the direct
>>>> >                 runner with the portability APIs implemented). Our goal
>>>> >                 in working on this was to have a portable runner which
>>>> >                 ran locally so it could be used by users for testing
>>>> >                 portable pipelines, by devs for testing new features
>>>> >                 with portability, and by runner authors as a simple
>>>> >                 reference implementation of a portable runner.
>>>> >
>>>> >                 Due to various circumstances though, progress on the
>>>> >                 Java reference runner has been pretty slow, and a Python
>>>> >                 runner which does pretty much the same things was made
>>>> >                 to aid portability development in Python (called the
>>>> >                 FnApiRunner). This runner is currently further along in
>>>> >                 feature work than the Java reference runner, so we've
>>>> >                 been reevaluating if we should switch to investing in it
>>>> >                 instead.
>>>> >
>>>> >                 My question to the community is: Which runner do you
>>>> >                 think would be more valuable to the dev community and
>>>> >                 Beam users? For those of you who are runner authors, do
>>>> >                 you have a preference for what language you'd like to
>>>> >                 see a reference implementation in?
>>>> >
>>>> >                 Thanks,
>>>> >                 Daniel Oliveira
>>>> >
>>>>
>>>
