On Mon, Jun 8, 2020 at 12:57 PM Chad Dombrova <chad...@gmail.com> wrote:
> Hi all,
> quick followup question:
>
>>> small correction. While the new runner will be available with Beam 2.21,
>>> the Cross-Language support will be available in 2.22.
>>> There will be limitations in the initial set of connectors you can use
>>> with Cross-Lang. But at least you will have something to test with,
>>> starting in 2.22
>>>
>>
>> To clarify, we're not actually prohibiting any other
>> cross-language transforms being used, but Kafka is the only one that'll be
>> extensively tested and supported at this time.
>>
>
> We're currently using the Flink runner with external Java PubSubIO
> transforms in our python pipelines because there is no pure python option.
> In its non-portable past, Dataflow has had its own native implementation
> of PubSubIO, that got switched out at runtime, so there was no need to use
> external transforms there. What's the story around PubSubIO when using
> Dataflow + portability? If we were to switch from Flink to Dataflow, would
> we continue to use external Java PubSubIO transforms, or is there still
> some special treatment of pubsub for Portable Dataflow?
>

Even when running portably, Dataflow still has its own implementation of
PubSubIO, which it swaps in at runtime in place of Python's "implementation."
(It's actually built into the same layer that provides the
shuffle/group-by-key implementation.) However, if you use the external Java
PubSubIO, Dataflow may not recognize it as something to replace, and may
continue to run that Java implementation even on Dataflow.
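
To make the distinction concrete, here is a toy model of that substitution, not Beam or Dataflow code. The Transform class, the plan function, the override table, and the identifier strings are all made up for illustration. The only assumption carried over from the above is that the runner replaces transforms it recognizes and leaves everything else alone.

```python
# Toy model only: every name and identifier string below is hypothetical
# and is not part of the Beam or Dataflow APIs.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transform:
    urn: str   # identifier the runner inspects when planning the job
    impl: str  # implementation the SDK would run if nothing is replaced


# Hypothetical table of transforms this runner knows how to replace natively.
NATIVE_OVERRIDES: Dict[str, str] = {
    "example:pubsub_read:python": "runner-native pubsub read",
}


def plan(pipeline: List[Transform]) -> List[str]:
    """Return the implementation the runner would use for each transform."""
    return [NATIVE_OVERRIDES.get(t.urn, t.impl) for t in pipeline]


python_read = Transform("example:pubsub_read:python", "python sdk pubsub read")
external_read = Transform("example:external:java_pubsub_read", "java sdk pubsub read")

print(plan([python_read]))    # recognized, so the runner's own version is used
print(plan([external_read]))  # not recognized, so the Java version still runs
```

In this sketch the external Java read keeps its own implementation simply because its identifier is not in the runner's table, which is the situation described above for external PubSubIO on Dataflow.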