There are essentially two complementary portability API surfaces that you'd need to implement: job management including job submission and execution as well as some worker deployment plumbing specific to the runner. Note that the source of truth is the model protos -- the design docs linked from https://beam.apache.org/contribute/portability/ and (even more so) the website guides are not always up to date.
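To make the job-management surface a bit more concrete, here is a minimal sketch in Python. It is not Beam's implementation: it only assumes the prepare / run / get-state / cancel shape of the Job API in the model protos. The class `InMemoryJobService`, its method names, the `execute` hook, and the simplified state set are all hypothetical stand-ins; a real runner would serve these calls over gRPC from the generated proto stubs and translate the submitted pipeline proto for its engine.

```python
import enum
import itertools


class JobState(enum.Enum):
    # Simplified, hypothetical state set; the model protos define the real one.
    STOPPED = "STOPPED"
    RUNNING = "RUNNING"
    DONE = "DONE"
    CANCELLED = "CANCELLED"


class InMemoryJobService:
    """Hypothetical stand-in for a runner's job-management surface."""

    def __init__(self, execute):
        # `execute` is the engine-specific hook (an assumption of this sketch):
        # it receives the submitted pipeline and runs it on the engine.
        self._execute = execute
        self._ids = itertools.count(1)
        self._prepared = {}  # preparation id -> pipeline
        self._states = {}    # job id -> JobState

    def prepare(self, pipeline):
        """Stage a pipeline and return a preparation id."""
        preparation_id = "prep-%d" % next(self._ids)
        self._prepared[preparation_id] = pipeline
        return preparation_id

    def run(self, preparation_id):
        """Start a previously prepared pipeline and return a job id."""
        pipeline = self._prepared.pop(preparation_id)
        job_id = "job-%d" % next(self._ids)
        self._states[job_id] = JobState.RUNNING
        # Synchronous for brevity; a real runner would not block here.
        self._execute(pipeline)
        if self._states[job_id] is JobState.RUNNING:
            self._states[job_id] = JobState.DONE
        return job_id

    def get_state(self, job_id):
        """Report the current state of a job."""
        return self._states[job_id]

    def cancel(self, job_id):
        """Cancel a job if it is still running; return the resulting state."""
        if self._states[job_id] is JobState.RUNNING:
            self._states[job_id] = JobState.CANCELLED
        return self._states[job_id]


# Usage: the "engine" here just records what it was asked to run.
executed = []
service = InMemoryJobService(execute=executed.append)
job_id = service.run(service.prepare({"transforms": ["Read", "Count"]}))
final_state = service.get_state(job_id)
```

The engine-specific part is confined to the `execute` hook, which is where the question of how directly the Beam model maps to the engine would show up.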
Currently, all runners are in Java and share numerous components and utilities. A non-JVM runner would have to build all that from scratch -- although, as you mention, if you're using Go or Python the corresponding SDKs likely have many pieces that can be reused. A minor potential hiccup is that gRPC/protobuf is not natively supported everywhere, so you may end up interoperating with the C versions of the libraries if you pick an unsupported language. A separate challenge regardless of the language is how directly the Beam model and primitives map to the engine.

All that said, I think it's definitely feasible to do something interesting. Are you specifically thinking of a Go Wallaroo runner?

Thanks,
Henning

On Tue, Jul 17, 2018 at 9:26 AM Austin Bennett <whatwouldausti...@gmail.com> wrote:

> Sweet; that led me to:
> https://beam.apache.org/contribute/runner-guide/#the-runner-api (which I
> can't believe I missed).
>
> On Tue, Jul 17, 2018 at 9:21 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi Austin,
>>
>> If your runner provides the gRPC portability layer (allowing any SDK to
>> "interact" with the runner), it will work no matter how the runner is
>> implemented (JVM or not).
>>
>> However, it means that you will have to mimic the Runner API for the
>> translation.
>>
>> Regards
>> JB
>>
>> On 17/07/2018 18:19, Austin Bennett wrote:
>> > Hi Beam Devs,
>> >
>> > I still don't quite understand:
>> >
>> > "Apache Beam provides a portable API layer for building sophisticated
>> > data-parallel processing pipelines that may be executed across a
>> > diversity of execution engines, or /runners/."
>> >
>> > (from https://beam.apache.org/documentation/runners/capability-matrix/)
>> >
>> > And specifically, close reading
>> > of: https://beam.apache.org/contribute/portability/
>> >
>> > What if I'd like to implement a runner that is non-JVM? Though would
>> > leverage the Python and Go SDKs? Specifically, thinking of:
>> > https://www.wallaroolabs.com (I am out in NY meeting with friends there
>> > later this week, and wanted to get a sense of feasibility, work
>> > involved, etc -- to propose that we add a new Wallaroo runner).
>> >
>> > Is there a way to keep Java out of the mix completely and still work
>> > with Beam on a non-JVM runner (seems maybe eventually, but what about
>> > currently/near future)?
>> >
>> > Any input, thoughts, ideas, other pages or info to explore -- all
>> > appreciated; thanks!
>> > Austin
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com