Hi all,
I've run into a situation where I would like to return two PCollections
from a single PTransform. I am aware of the ParDo.with_outputs construct, but
in this case the PCollections are the flattened results of several other
transforms, and it would be cleaner to just return multiple PCollections in
a tuple.
I've tested this out with the following snippet and it seems to work (at
least on the direct runner):
---
import apache_beam as beam


@beam.ptransform_fn
def test(pcoll):
    a = pcoll | '1' >> beam.Map(lambda x: x + 1)
    b = pcoll | '2' >> beam.Map(lambda x: x + 10)
    return (a, b)


with beam.Pipeline() as p:
    c = p | beam.Create(list(range(10)))
    a, b = c | test()
    a | 'a' >> beam.Map(lambda x: print('a %d' % x))
    b | 'b' >> beam.Map(lambda x: print('b %d' % x))
---
I'm curious whether this type of pipeline construction is well supported and
whether I will run into any issues on other runners.
Thanks!
- Harrison