I agree, it's useful for runners that are used for tests (including testing SDKs) to push into the dark corners of what's allowed by the spec. I think checks like these can be added (where they don't already exist) to existing non-production runners. (Whether a direct runner should be considered production or not depends on who you ask...)
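
To make that concrete, here is a minimal sketch of the kind of checks Kenn lists below (GBK random ordering, illegal mutation, encodability). It is plain Python with no Beam dependency; the function names are made up for illustration and pickle stands in for a real coder, so treat it as a sketch of the idea, not as how the DirectRunner or FnApiRunner actually does it.

```python
import copy
import pickle
import random
from collections import defaultdict


def antagonistic_group_by_key(pairs, seed=None):
    """Group (key, value) pairs and deliberately shuffle each key's values.

    Hypothetical helper: user code that relies on value order after a
    grouping step will fail here instead of on a different runner.
    """
    rng = random.Random(seed)
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    for values in grouped.values():
        rng.shuffle(values)
    return dict(grouped)


def check_encodable(element):
    """Round-trip an element through pickle (a stand-in for a real coder)."""
    decoded = pickle.loads(pickle.dumps(element))
    if decoded != element:
        raise ValueError("element did not survive an encode/decode round trip")
    return decoded


def call_checking_mutation(fn, element):
    """Call fn(element) and fail if fn mutated its input in place."""
    snapshot = copy.deepcopy(element)
    result = fn(element)
    if element != snapshot:
        raise ValueError("illegal mutation of input element")
    return result
```

A test runner could wrap every user function call in `call_checking_mutation` and pass every element crossing a stage boundary through `check_encodable`, so the failure shows up locally and not only on a distributed runner.
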
On Wed, Feb 13, 2019 at 2:49 AM Daniel Oliveira <danolive...@google.com> wrote:

> +1 to Kenn's point. Regardless of whether we go with a Python runner or a
> Java runner, I think we should have at least one portable runner that isn't
> a production runner for the reasons he outlined.
>
> As for the rest of the discussion, it sounds like people are generally
> supportive of having the Python FnApiRunner as that runner, and using Flink
> as a reference implementation for portability in Java.
>
> On Tue, Feb 12, 2019 at 4:37 PM Kenneth Knowles <k...@apache.org> wrote:
>
>> On Tue, Feb 12, 2019 at 8:59 AM Thomas Weise <t...@apache.org> wrote:
>>
>>> The Java ULR initially provided some value for the portability effort as
>>> Max mentions. It helped to develop the shared library for all Java runners
>>> and the job server functionality.
>>>
>>> However, I think the same could have been accomplished by developing the
>>> Flink runner instead of the Java ULR from the get go. This is also what
>>> happened later last year when support for state, timers and metrics was
>>> added to the portable Flink runner first and the ULR still does not support
>>> those features [1].
>>>
>>> Since all (or most) Java based runners that are based on another ASF
>>> project support embedded execution, I think it might make sense to
>>> discontinue separate direct runners for Java and instead focus efforts on
>>> making the runners that folks would also use in production better?
>>
>> Caveat: if people only test using embedded execution of a production
>> runner, they are quite likely to depend on quirks of that runner, such as
>> bundle size, fusion, whether shuffle is also checkpoint, etc. I think
>> there's a lot of value in an antagonistic testing runner, which is
>> something the Java DirectRunner tried to do with GBK random ordering,
>> checking illegal mutations, checking encodability. These were all driven by
>> real user needs and each caught a lot of user bugs.
>> That said, I wouldn't want to maintain an extra runner, but would like to
>> put these into a portable runner, whichever it is.
>>
>> Kenn
>>
>>> As for Python (and hopefully soon Go), it makes a lot of sense to have a
>>> simple to use and stable runner that can be used for local development. At
>>> the moment, the Py FnApiRunner seems the best candidate to serve as
>>> reference for portability.
>>>
>>> On a related note, we should probably also consider making pure Java
>>> pipeline execution via portability framework on a Java runner simpler and
>>> more efficient. We already use embedded environment for testing. If we also
>>> inline/embed the job server and this becomes readily available and easy to
>>> use, it might improve chances of other runners migrating to portability
>>> sooner.
>>>
>>> Thomas
>>>
>>> [1] https://s.apache.org/apache-beam-portability-support-table
>>>
>>> On Tue, Feb 12, 2019 at 3:34 AM Maximilian Michels <m...@apache.org> wrote:
>>>
>>>> Do you consider job submission and artifact staging part of the
>>>> ReferenceRunner? If so, these parts have been reused or served as a
>>>> model for the portable FlinkRunner. So they had some value.
>>>>
>>>> A reference implementation helps Runner authors to understand and reuse
>>>> the code. However, I agree that the Flink implementation is more helpful
>>>> to Runner authors than a ReferenceRunner which was designed for single
>>>> node testing.
>>>>
>>>> I think there are three parts which help to push forward portability:
>>>>
>>>> 1) Good library support for new portable Runners (Java)
>>>> 2) A reference implementation of a distributed Runner (Flink)
>>>> 3) An easy way for users to run/test portable Pipelines (Python via
>>>>    FnApiRunner)
>>>>
>>>> The main motivation for the portability layer is supporting languages
>>>> in addition to Java. Most users will be using Python, so focusing on a
>>>> good reference Runner in Python is key.
>>>>
>>>> -Max
>>>>
>>>> On 12.02.19 10:11, Robert Bradshaw wrote:
>>>> > This is certainly an interesting question, and I definitely have my
>>>> > opinions, but am curious as to what others think as well.
>>>> >
>>>> > One thing that I think wasn't as clear from the outset is distinguishing
>>>> > between the development of runners/core-java and development of a Java
>>>> > reference runner itself. With the work on moving Flink to
>>>> > portability, it turned out that work on the latter was not a
>>>> > prerequisite for work on the former, and runners/core-java is the
>>>> > artifact that other runners want to build on. I think that it is also
>>>> > the case, as suggested, that a distributed runner's use of this shared
>>>> > library is a better reference point (for other distributed runners) than
>>>> > one using the direct runner (e.g. there is a much more obvious
>>>> > delineation between the runner's responsibility and Beam code than in
>>>> > the direct runner where the boundaries between orchestration, execution,
>>>> > and other concerns are not as clear).
>>>> >
>>>> > As well as serving as a reference to runner implementers, the reference
>>>> > runner can also be useful for prototyping (here I think Python holds an
>>>> > advantage, but we're getting into subjective areas now), documenting (or
>>>> > ideally augmenting the documentation of) the spec (here I'd say a
>>>> > smaller advantage to Python, but neither runner is clean,
>>>> > straightforward, and documented enough to serve this purpose well yet),
>>>> > and serving as a lightweight universal local runner against which to
>>>> > develop (and, possibly use long term in place of a direct runner) new
>>>> > SDKs (here you'll get a wide variety of answers whether Python or Java
>>>> > is easier to take on as a dependency for a third language, or we could
>>>> > just package it up in a docker image and take docker as a dependency).
>>>> >
>>>> > Another more pragmatic note is that one thing that helped move both the
>>>> > Flink runner and the FnApiRunner forward is that they were driven by
>>>> > actual use cases: Lyft has actual Python (necessitating portable)
>>>> > pipelines they want to run on Flink, and the FnApiRunner is the direct
>>>> > runner for Python. The Java ULR (at least where it is now) sits in an
>>>> > awkward place where its only role is to be a reference rather than be
>>>> > used, which (in a world of limited resources) makes it harder to
>>>> > justify investment.
>>>> >
>>>> > - Robert
>>>> >
>>>> > On Tue, Feb 12, 2019 at 3:53 AM Kenneth Knowles <k...@apache.org> wrote:
>>>> >
>>>> >     Interesting silence here. You've got it right that the reason we
>>>> >     initially chose Java was because of the cross-runner sharing. The
>>>> >     reference runner could be the first target runner for any new
>>>> >     feature and then its work could be used directly (or indirectly
>>>> >     via copy/paste/modify if that works better) in other runners.
>>>> >     Examples:
>>>> >
>>>> >     - The implementations of (pre-portability) state & timers in
>>>> >       runners/core-java and prototyped in the Java DirectRunner made
>>>> >       it a matter of a couple of days to implement on other runners,
>>>> >       and they saw pretty quick adoption.
>>>> >     - Probably the same could be said for the first drafts of the
>>>> >       runners, which re-used a bunch of runners/core-java and had each
>>>> >       others' translation code as a reference.
>>>> >
>>>> >     I'm interested if anyone would be willing to confirm whether it is
>>>> >     because the FlinkRunner has forged ahead and the Dataflow worker is
>>>> >     open source. It makes sense that the code from a distributed runner
>>>> >     is an even better reference point if you are building another
>>>> >     distributed runner. From the look of it, the SamzaRunner had no
>>>> >     trouble getting started on portability.
>>>> >
>>>> >     Kenn
>>>> >
>>>> >     On Mon, Feb 11, 2019 at 6:04 PM Daniel Oliveira
>>>> >     <danolive...@google.com> wrote:
>>>> >
>>>> >         Yeah, the FnApiRunner is what I'm leaning towards too. I wasn't
>>>> >         sure how much demand there was for an actual reference
>>>> >         implementation in Java though, so I was hoping there were runner
>>>> >         authors that would want to chime in.
>>>> >
>>>> >         On the other hand, the Flink runner could serve as a reference
>>>> >         implementation for portable features since it's further along,
>>>> >         so maybe it's not an issue regardless.
>>>> >
>>>> >         On Mon, Feb 11, 2019 at 1:09 PM Sam Rohde <sro...@google.com>
>>>> >         wrote:
>>>> >
>>>> >             Thanks for starting this thread. If I had to guess, I would
>>>> >             say there is more of a demand for Python as it's more widely
>>>> >             used by data scientists and for analytics. Being pragmatic,
>>>> >             the FnApiRunner already has more feature work than the Java
>>>> >             runner, so we should go with that.
>>>> >
>>>> >             -Sam
>>>> >
>>>> >             On Fri, Feb 8, 2019 at 10:07 AM Daniel Oliveira
>>>> >             <danolive...@google.com> wrote:
>>>> >
>>>> >                 Hello Beam dev community,
>>>> >
>>>> >                 For those who don't know me, I work for Google and I've
>>>> >                 been working on the Java reference runner, which is a
>>>> >                 portable, local Java runner (it's basically the direct
>>>> >                 runner with the portability APIs implemented). Our goal
>>>> >                 in working on this was to have a portable runner which
>>>> >                 ran locally so it could be used by users for testing
>>>> >                 portable pipelines, devs for testing new features with
>>>> >                 portability, and for runner authors to provide a simple
>>>> >                 reference implementation of a portable runner.
>>>> >
>>>> >                 Due to various circumstances though, progress on the
>>>> >                 Java reference runner has been pretty slow, and a Python
>>>> >                 runner which does pretty much the same things was made
>>>> >                 to aid portability development in Python (called the
>>>> >                 FnApiRunner). This runner is currently further along in
>>>> >                 feature work than the Java reference runner, so we've
>>>> >                 been reevaluating if we should switch to investing in it
>>>> >                 instead.
>>>> >
>>>> >                 My question to the community is: Which runner do you
>>>> >                 think would be more valuable to the dev community and
>>>> >                 Beam users? For those of you who are runner authors, do
>>>> >                 you have a preference for what language you'd like to
>>>> >                 see a reference implementation in?
>>>> >
>>>> >                 Thanks,
>>>> >                 Daniel Oliveira
>>>> >
>>>>
>>>