Now that we have the FnAPI, I started playing around with support for
cross-language pipelines. This will allow things like IOs to be shared
across all languages, SQL to be invoked from non-Java SDKs, TFX TensorFlow
transforms to be invoked from non-Python SDKs, etc. I think this is the
next step in extending (and taking advantage of) the portability layer
we've developed. The transforms in question are often composites whose
inner structure depends in non-trivial ways on their configuration.

I created a PR [1] that basically follows the "expand via an external
process" (over RPC) alternative from the proposals we came up with the
last time we discussed this [2]. There are still some unknowns, e.g. how
to handle artifacts supplied by an alternative SDK (they currently must
be provided by the environment), but I think this is a good incremental
step forward that will already be useful in a large number of cases. It
would be good to validate the general direction, and I would be
interested in any feedback others may have on it.
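
To make the "expand via an external process" idea concrete, here is a
toy sketch. It is not the code in the PR and does not use any Beam API:
the URN, the config keys, the step names, and both function names are
made up for illustration, and the "RPC" is just a local function call
on serialized bytes. It only shows why the caller cannot hard-code the
composite's inner structure: that structure is decided by the SDK that
owns the transform, based on the configuration.

```python
import json


def expand(request_bytes: bytes) -> bytes:
    """Hypothetical expansion endpoint, running in the SDK that owns the transform."""
    request = json.loads(request_bytes)
    urn, config = request["urn"], request["config"]
    if urn != "example:read_files:v1":  # made-up URN, for illustration only
        raise ValueError(f"unknown transform: {urn}")

    # The sub-transform graph depends on the configuration.
    steps = ["MatchFiles", "ReadMatches"]
    if config.get("compression"):
        steps.append("Decompress[" + config["compression"] + "]")
    steps.append("Parse[" + config.get("format", "text") + "]")
    if config.get("reshuffle", False):
        steps.append("Reshuffle")
    return json.dumps({"urn": urn, "subtransforms": steps}).encode()


def expand_remotely(urn: str, config: dict, transport=expand) -> list:
    """Caller side: serialize the request, 'send' it, and return the expansion.

    `transport` stands in for the RPC; here it is a plain function call.
    """
    response = transport(json.dumps({"urn": urn, "config": config}).encode())
    return json.loads(response)["subtransforms"]


if __name__ == "__main__":
    print(expand_remotely("example:read_files:v1", {"format": "csv"}))
    print(expand_remotely("example:read_files:v1",
                          {"compression": "gzip", "reshuffle": True}))
```

The calling SDK only needs to know the URN and how to serialize the
configuration; everything else comes back from the expansion call.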

- Robert

[1] https://github.com/apache/beam/pull/7316
[2] https://s.apache.org/beam-mixed-language-pipelines
