Hi,

Currently the portable Flink runner only supports SDK Docker containers
for execution (DockerJobBundleFactory), apart from an in-process (embedded)
factory option used for testing [1]. I'm considering adding another
out-of-process JobBundleFactory implementation that forks the SDK harness
processes directly on the task manager host, eliminating the need for
Docker. This would work reasonably well in environments where the
dependencies (in this case Python) can easily be tied into the host
deployment (including an application-specific Kubernetes pod).

Alternative JobBundleFactory implementations were already discussed in
[2]. There is also a JIRA to make the bundle factory pluggable [3],
pending the availability of runner-level options.

For a "ProcessBundleFactory", the environment would need, in addition to
the Python dependencies, the Go boot executable [4] (or a substitute for
it) to perform the harness initialization.
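
To make the idea a bit more concrete, here is a rough sketch of the forking
part using plain java.lang.ProcessBuilder. It is only an illustration under
assumptions: the class and method names are hypothetical, and the flag names
passed to the boot executable are placeholders, so the actual ones would
have to be taken from boot.go [4].

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not existing Beam code.
class ProcessEnvironmentSketch {

    // Builds the command line for the boot executable.
    // The flag names are placeholders (assumption); check boot.go for the real ones.
    static List<String> bootCommand(String executable, String workerId, String controlEndpoint) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        command.add("--id=" + workerId);
        command.add("--control_endpoint=" + controlEndpoint);
        return command;
    }

    // Forks the process on the local host. Its stdout/stderr are inherited
    // so the harness output ends up in the task manager log.
    static Process fork(List<String> command, File workDir) throws IOException {
        return new ProcessBuilder(command)
            .directory(workDir)
            .inheritIO()
            .start();
    }
}
```

The factory would additionally have to track the returned Process and
destroy it when the environment is closed, which the sketch leaves out.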

Is anyone else interested in this SDK execution option, or has anyone
already investigated an alternative implementation?

Thanks,
Thomas

[1]
https://github.com/apache/beam/blob/7958a379b0a37a89edc3a6ae4b5bc82fda41fcd6/runners/flink/src/test/java/org/apache/beam/runners/flink/PortableExecutionTest.java#L83

[2]
https://lists.apache.org/thread.html/d6b6fde764796de31996db9bb5f9de3e7aaf0ab29b99d0adb52ac508@%3Cdev.beam.apache.org%3E

[3] https://issues.apache.org/jira/browse/BEAM-4819

[4] https://github.com/apache/beam/blob/master/sdks/python/container/boot.go
