Ankur/Eugene,

When you have a chance, please also update the Flink section of:
https://docs.google.com/spreadsheets/d/1KDa_FGn1ShjomGd-UUDOhuh2q73de2tPz6BqHpzqvNI/edit#gid=0
Thanks!

On Thu, Jun 28, 2018 at 10:29 AM Thomas Weise <t...@apache.org> wrote:
> The command to run the job server appears to be:
> ./gradlew -p runners/flink/job-server runShadow
>
> Can you please provide the equivalent of the super basic Python example
> from the prototype:
>
> https://github.com/bsidhom/beam/blob/hacking-job-server/sdks/python/flink-example.py
>
> Looks as if the Python side runner changed:
>
> Traceback (most recent call last):
>   File "flink-example.py", line 7, in <module>
>     from apache_beam.runners.portability import universal_local_runner
> ImportError: cannot import name universal_local_runner
>
> Thanks,
> Thomas
>
>
> On Wed, Jun 27, 2018 at 9:34 PM Eugene Kirpichov <kirpic...@google.com>
> wrote:
>
>> Hi!
>>
>> Those instructions are not current and I think should be discarded as
>> they referred to a particular effort that is over - +Ankur Goenka
>> <goe...@google.com> is, I believe, working on the remaining finishing
>> touches for running from a clean clone of Beam master and documenting how
>> to do that; could you help Thomas so we can start looking at what the
>> streaming runner is missing?
>>
>> We'll need to document this in a more prominent place. When we get to a
>> state where we can run Python WordCount from master, we'll need to document
>> it somewhere on the main portability page and/or the getting started guide;
>> when we can run something more serious, e.g. Tensorflow pipelines, that
>> will be worth a Beam blog post and worth documenting in the TFX
>> documentation.
>>
>> On Wed, Jun 27, 2018 at 5:35 AM Thomas Weise <t...@apache.org> wrote:
>>
>>> Hi Eugene,
>>>
>>> The basic streaming translation is already in place from the prototype,
>>> though I have not verified it on the master branch yet.
>>>
>>> Are the user instructions for the portable Flink runner at
>>> https://s.apache.org/beam-portability-team-doc current?
>>>
>>> (I don't have a dependency on SDF since we are going to use custom
>>> native Flink sources/sinks at this time.)
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>>> On Tue, Jun 26, 2018 at 2:13 AM Eugene Kirpichov <kirpic...@google.com>
>>> wrote:
>>>
>>>> Hi!
>>>>
>>>> Wanted to let you know that I've just merged the PR that adds
>>>> checkpointable SDF support to the portable reference runner (ULR) and the
>>>> Java SDK harness:
>>>>
>>>> https://github.com/apache/beam/pull/5566
>>>>
>>>> So now we have a reference implementation of SDF support in a portable
>>>> runner, and a reference implementation of SDF support in a portable SDK
>>>> harness.
>>>> From here on, we need to replicate this support in other portable
>>>> runners and other harnesses. The obvious targets are Flink and Python
>>>> respectively.
>>>>
>>>> Chamikara was going to work on the Python harness. +Thomas Weise
>>>> <t...@apache.org> Would you be interested in the Flink portable
>>>> streaming runner side? It is of course blocked by having the rest of that
>>>> runner working in streaming mode though (the batch mode is practically done
>>>> - will send you a separate note about the status of that).
>>>>
>>>> On Fri, Mar 23, 2018 at 12:20 PM Eugene Kirpichov <kirpic...@google.com>
>>>> wrote:
>>>>
>>>>> Luke is right - unbounded sources should go through SDF. I am
>>>>> currently working on adding such support to Fn API.
>>>>> The relevant document is s.apache.org/beam-breaking-fusion (note: it
>>>>> focuses on a much more general case, but also considers in detail the
>>>>> specific case of running unbounded sources on Fn API), and the first
>>>>> related PR is https://github.com/apache/beam/pull/4743 .
>>>>>
>>>>> Ways you can help speed up this effort:
>>>>> - Make necessary changes to Apex runner per se to support regular SDFs
>>>>> in streaming (without portability). They will likely largely carry over to
>>>>> portable world.
>>>>> I recall that the Apex runner had some level of support of
>>>>> SDFs, but didn't pass the ValidatesRunner tests yet.
>>>>> - (general to Beam, not Apex-related per se) Implement the translation
>>>>> of Read.from(UnboundedSource) via impulse, which will require implementing
>>>>> an SDF that reads from a given UnboundedSource (taking the UnboundedSource
>>>>> as an element). This should be fairly straightforward and will allow all
>>>>> portable runners to take advantage of existing UnboundedSource's.
>>>>>
>>>>>
>>>>> On Fri, Mar 23, 2018 at 3:08 PM Lukasz Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> Using impulse is a precursor for both bounded and unbounded SDF.
>>>>>>
>>>>>> This JIRA represents the work that would be to add support for
>>>>>> unbounded SDF using portability APIs:
>>>>>> https://issues.apache.org/jira/browse/BEAM-2939
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 23, 2018 at 11:46 AM Thomas Weise <t...@apache.org> wrote:
>>>>>>
>>>>>>> So for streaming, we will need the Impulse translation for bounded
>>>>>>> input, identical with batch, and then in addition to that support for
>>>>>>> SDF?
>>>>>>>
>>>>>>> Any pointers what's involved in adding the SDF support? Is it runner
>>>>>>> specific? Does the ULR cover it?
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Mar 23, 2018 at 11:26 AM, Lukasz Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> All "sources" in portability will use splittable DoFns for
>>>>>>>> execution.
>>>>>>>>
>>>>>>>> Specifically, runners will need to be able to checkpoint unbounded
>>>>>>>> sources to get a minimum viable pipeline working.
>>>>>>>> For bounded pipelines, a DoFn can read the contents of a bounded
>>>>>>>> source.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Mar 23, 2018 at 10:52 AM Thomas Weise <t...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm looking at the portable pipeline translation for streaming. I
>>>>>>>>> understand that for batch pipelines, it is sufficient to translate
>>>>>>>>> Impulse.
>>>>>>>>>
>>>>>>>>> What is the intended path to support unbounded sources?
>>>>>>>>>
>>>>>>>>> The goal here is to get a minimum translation working that will
>>>>>>>>> allow streaming wordcount execution.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Thomas