There are essentially two complementary portability API surfaces that you'd
need to implement: job management (including job submission) and execution,
as well as some worker deployment plumbing specific to the runner. Note that
the source of truth is the model protos -- the design docs linked from
https://beam.apache.org/contribute/portability/ and (even more so) the
website guides are not always up to date.
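
To make the job-management side a little more concrete, here is a minimal
sketch of the lifecycle a runner would have to serve (prepare, run, query
state, cancel). It is plain Python with no gRPC and no Beam imports; the class
names, method names, and state values are all hypothetical stand-ins, and the
real service definitions are whatever the model protos say.

```python
import enum
import uuid
from dataclasses import dataclass


class JobState(enum.Enum):
    # Hypothetical states for illustration only; the authoritative
    # set of states lives in the model protos.
    PREPARED = "PREPARED"
    RUNNING = "RUNNING"
    DONE = "DONE"
    CANCELLED = "CANCELLED"


@dataclass
class Job:
    name: str
    pipeline: bytes  # stand-in for a serialized pipeline definition
    state: JobState = JobState.PREPARED


class InMemoryJobService:
    """Toy job-management surface: prepare, run, query state, cancel."""

    def __init__(self, execute):
        # `execute` is a hypothetical hook: a callable that hands the
        # pipeline to the underlying engine and returns when it finishes.
        self._execute = execute
        self._jobs = {}

    def prepare(self, name, pipeline):
        """Register a pipeline and return an id used by later calls."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = Job(name, pipeline)
        return job_id

    def run(self, job_id):
        """Run a prepared job synchronously (a real runner would not block)."""
        job = self._jobs[job_id]
        if job.state is not JobState.PREPARED:
            raise ValueError(f"job {job_id} is {job.state.value}, not PREPARED")
        job.state = JobState.RUNNING
        self._execute(job.pipeline)
        job.state = JobState.DONE

    def get_state(self, job_id):
        return self._jobs[job_id].state

    def cancel(self, job_id):
        """Cancel a job that has not finished; finished jobs are left as-is."""
        job = self._jobs[job_id]
        if job.state in (JobState.PREPARED, JobState.RUNNING):
            job.state = JobState.CANCELLED
        return job.state
```

In an actual runner each of these methods would sit behind a gRPC endpoint and
`run` would hand off to the engine asynchronously; the sketch only shows the
state transitions a client would observe.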

Currently, all runners are in Java and share numerous components and
utilities. A non-JVM runner would have to build all that from scratch --
although, as you mention, if you're using Go or Python the corresponding
SDKs likely have many pieces that can be reused. A minor potential hiccup
is that gRPC/protobuf is not natively supported everywhere, so you may end
up interoperating with the C versions of the libraries if you pick a
non-supported language. A separate challenge regardless of the language is
how directly the Beam model and primitives map to the engine.

All that said, I think it's definitely feasible to do something
interesting. Are you specifically thinking of a Go Wallaroo runner?

Thanks,
 Henning

On Tue, Jul 17, 2018 at 9:26 AM Austin Bennett <whatwouldausti...@gmail.com>
wrote:

> Sweet; that led me to:
> https://beam.apache.org/contribute/runner-guide/#the-runner-api (which I
> can't believe I missed).
>
>
>
> On Tue, Jul 17, 2018 at 9:21 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi Austin,
>>
>> If your runner provides the gRPC portability layer (allowing any SDK to
>> "interact" with the runner), it will work no matter how the runner is
>> implemented (JVM or not).
>>
>> However, it means that you will have to mimic the Runner API for the
>> translation.
>>
>> Regards
>> JB
>>
>> On 17/07/2018 18:19, Austin Bennett wrote:
>> > Hi Beam Devs,
>> >
>> > I still don't quite understand:
>> >
>> > "Apache Beam provides a portable API layer for building sophisticated
>> > data-parallel processing pipelines that may be executed across a
>> > diversity of execution engines, or /runners/."
>> >
>> > (from https://beam.apache.org/documentation/runners/capability-matrix/)
>> >
>> > And specifically, after a close reading of:
>> > https://beam.apache.org/contribute/portability/
>> >
>> > What if I'd like to implement a runner that is non-JVM, though it would
>> > leverage the Python and Go SDKs?  Specifically, I'm thinking of:
>> >  https://www.wallaroolabs.com (I am out in NY meeting with friends there
>> > later this week, and wanted to get a sense of feasibility, work
>> > involved, etc. -- to propose that we add a new Wallaroo runner).
>> >
>> > Is there a way to keep Java out of the mix completely and still work
>> > with Beam on a non-JVM runner (it seems that may be possible eventually,
>> > but what about currently or in the near future)?
>> >
>> > Any input, thoughts, ideas, other pages or info to explore -- all
>> > appreciated; thanks!
>> > Austin
>> >
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
>
