Thanks all! Yeah, I'll update the Portability page with the status of this
project and other pointers this week or next (mostly out of office this
week).

On Fri, May 18, 2018 at 5:01 PM Thomas Weise <t...@apache.org> wrote:

> - Flink JobService: in review <https://github.com/apache/beam/pull/5262>
>
> That's TODO (above PR was merged, but it doesn't contain the Flink job
> service).
>
> Discussion about it is here:
> https://docs.google.com/document/d/1xOaEEJrMmiSHprd-WiYABegfT129qqF-idUBINjxz8s/edit?ts=5afa1238
>
> Thanks,
> Thomas
>
>
>
> On Fri, May 18, 2018 at 7:01 AM, Thomas Weise <t...@apache.org> wrote:
>
>> Most of it should probably go to
>> https://beam.apache.org/contribute/portability/
>>
>> Also for reference, here is the prototype doc:
>> https://s.apache.org/beam-portability-team-doc
>>
>> Thomas
>>
>> On Fri, May 18, 2018 at 5:35 AM, Kenneth Knowles <k...@google.com> wrote:
>>
>>> This is awesome. Would you be up for adding a brief description at
>>> https://beam.apache.org/contribute/#works-in-progress and maybe a
>>> pointer to a gdoc with something like the contents of this email? (my
>>> reasoning is (a) keep the contribution guide concise but (b) all this
>>> detail is helpful yet (c) the detail may be ever-changing so making a
>>> separate web page is not the best format)
>>>
>>> Kenn
>>>
>>> On Thu, May 17, 2018 at 3:13 PM Eugene Kirpichov <kirpic...@google.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> A little over a month ago, a large group of Beam community members
>>>> began working on a prototype of a portable Flink runner - that is, a
>>>> runner that can execute Beam pipelines on Flink via the Portability API
>>>> <https://s.apache.org/beam-runner-api>. The prototype was developed in
>>>> a separate branch
>>>> <https://github.com/bsidhom/beam/tree/hacking-job-server> and was
>>>> successfully demonstrated at Flink Forward, where it ran Python and Go
>>>> pipelines in a limited setting.
>>>>
>>>> Since then, a smaller group of people (Ankur Goenka, Axel Magnuson, Ben
>>>> Sidhom and myself) have been working on productionizing the prototype to
>>>> address its limitations and do things "the right way", preparing to reuse
>>>> this work for developing other portable runners (e.g. Spark). This involves
>>>> a surprising amount of work, since many important design and implementation
>>>> concerns could be ignored for the purposes of a prototype. I wanted to give
>>>> an update on where we stand now.
>>>>
>>>> Our immediate milestone in sight is *Run Java and Python batch
>>>> WordCount examples against a distributed remote Flink cluster*. That
>>>> involves a few moving parts, roughly in order of appearance:
>>>>
>>>> *Job submission:*
>>>> - The SDK is configured to use a "portable runner", whose
>>>> responsibility is to run the pipeline against a given JobService endpoint.
>>>> - The portable runner converts the pipeline to a portable Pipeline proto
>>>> - The runner finds out which artifacts it needs to stage, and stages
>>>> them against an ArtifactStagingService
>>>> - A Flink-specific JobService receives the Pipeline proto, performs
>>>> some optimizations (e.g. fusion) and translates it to Flink datasets and
>>>> functions
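Purely as an illustration of the fusion idea mentioned above (this is not Beam's actual implementation), a chain of element-wise transforms can be composed so it executes as a single stage; the `fuse` helper below is hypothetical:

```python
def fuse(*fns):
    """Compose per-element functions so a chain of transforms
    runs as one fused stage (a toy stand-in for runner-side fusion)."""
    def fused(element):
        for fn in fns:
            element = fn(element)
        return element
    return fused

# Hypothetical fused stage: strip whitespace, lowercase, then take the length.
stage = fuse(str.strip, str.lower, len)
print([stage(w) for w in ["  Hello ", "WORLD"]])  # -> [5, 5]
```

The real JobService performs this kind of optimization on the Pipeline proto before translating the fused stages to Flink datasets and functions.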
>>>>
>>>> *Job execution:*
>>>> - A Flink function executes a fused chain of Beam transforms (an
>>>> "executable stage") by converting the input and the stage to bundles and
>>>> executing them against an SDK harness
>>>> - The function starts the proper SDK harness, auxiliary services (e.g.
>>>> artifact retrieval, side input handling) and wires them together
>>>> - The function feeds the data to the harness and receives data back.
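To sketch the bundle-based execution model described above (again, illustrative only - the names and the stubbed harness are hypothetical, not Beam's API), a runner-side function might split its input into bundles and hand each one to a harness:

```python
def run_stage(elements, process_bundle, bundle_size=2):
    """Split input into bundles, execute each against a harness
    callable, and collect the results (a toy model of stage execution)."""
    results = []
    for i in range(0, len(elements), bundle_size):
        bundle = elements[i:i + bundle_size]
        results.extend(process_bundle(bundle))
    return results

# Stub standing in for the real SDK harness, which would execute the
# fused transforms of the stage on each bundle.
out = run_stage(["a", "b", "c"], lambda bundle: [e.upper() for e in bundle])
print(out)  # -> ['A', 'B', 'C']
```

In the real system the harness is a separate process (e.g. in a Docker container) reached over the Fn API, and bundles flow over gRPC rather than function calls.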
>>>>
>>>> *And here is our status of implementation for these parts:* basically,
>>>> almost everything is either done or in review.
>>>>
>>>> *Job submission:*
>>>> - General-purpose portable runner in the Python SDK: done
>>>> <https://github.com/apache/beam/pull/5301>; Java SDK: also done
>>>> <https://github.com/apache/beam/pull/5150>
>>>> - Artifact staging from the Python SDK: in review (PR
>>>> <https://github.com/apache/beam/pull/5273>, PR
>>>> <https://github.com/apache/beam/pull/5251>); in Java: done
>>>> - Flink JobService: in review
>>>> <https://github.com/apache/beam/pull/5262>
>>>> - Translation from a Pipeline proto to Flink datasets and functions:
>>>> done <https://github.com/apache/beam/pull/5226>
>>>> - ArtifactStagingService implementation that stages artifacts to a
>>>> location on a distributed filesystem: in development (design is clear)
>>>>
>>>> *Job execution:*
>>>> - Flink function for executing via an SDK harness: done
>>>> <https://github.com/apache/beam/pull/5285>
>>>> - APIs for managing lifecycle of an SDK harness: done
>>>> <https://github.com/apache/beam/pull/5152>
>>>> - Specific implementation of those APIs using Docker: part done
>>>> <https://github.com/apache/beam/pull/5189>, part in review
>>>> <https://github.com/apache/beam/pull/5392>
>>>> - ArtifactRetrievalService that retrieves artifacts from the location
>>>> where ArtifactStagingService staged them: in development.
>>>>
>>>> We expect the in-review parts to be merged and the in-development parts
>>>> completed in the next 2-3 weeks. We will, of course, update the
>>>> community when this important milestone is reached.
>>>>
>>>> *After that, the next milestones include:*
>>>> - Set up Java, Python and Go ValidatesRunner tests to run against the
>>>> portable Flink runner, and get them to pass
>>>> - Bring Python and Go test coverage to parity with Java
>>>> - Implement the portable Spark runner, with a similar lifecycle but
>>>> reusing almost all of the Flink work
>>>> - Add support for streaming to both (which requires SDF - that work is
>>>> progressing in parallel and by this point should be in a suitable place)
>>>>
>>>> *For people who would like to get involved in this effort: *You can
>>>> already help out by improving ValidatesRunner test coverage in Python and
>>>> Go. Java has >300 such tests, while Python has only a handful. There'll be a
>>>> large amount of parallelizable work once we get the VR test suites running
>>>> - stay tuned. SDF+Portability is also expected to produce a lot of
>>>> parallelizable work up for grabs within several weeks.
>>>>
>>>> Thanks!
>>>>
>>>
>>
>
