Re: Returning multiple PCollections from a PTransform

Harrison Green Wed, 05 Aug 2020 14:24:08 -0700

Awesome!

Is it possible to use Beam type hints in this scenario? For example,
could I explicitly annotate a transform that returns a tuple of
PCollections with something like @beam.typehints.with_output_types?


Thanks,
Harrison

On 2020/08/05 00:03:25, Robert Bradshaw <[email protected]> wrote:
> Yes, this is explicitly supported. You can return named tuples and
> dictionaries (with PCollections as values) as well.
>
> On Tue, Aug 4, 2020 at 5:00 PM Harrison Green <[email protected]> wrote:
> >
> > Hi all,
> >
> > I've run into a situation where I would like to return two PCollections
> > from a PTransform. I am aware of the ParDo.with_outputs construct, but in
> > this case the PCollections are the flattened results of several other
> > transforms, and it would be cleaner to just return multiple PCollections
> > in a tuple.
> >
> > I've tested this out with the following snippet and it seems to work
> > (at least on the direct runner):
> >
> > ---
> > import apache_beam as beam
> >
> > @beam.ptransform_fn
> > def test(pcoll):
> >     a = pcoll | '1' >> beam.Map(lambda x: x+1)
> >     b = pcoll | '2' >> beam.Map(lambda x: x+10)
> >
> >     return (a,b)
> >
> > with beam.Pipeline() as p:
> >     c = p | beam.Create(list(range(10)))
> >
> >     a,b = c | test()
> >
> >     a | 'a' >> beam.Map(lambda x: print('a %d' % x))
> >     b | 'b' >> beam.Map(lambda x: print('b %d' % x))
> > ---
> >
> > I'm curious if this type of pipeline construction is well-supported and
> > if I will run into any issues on other runners.
> >
> > Thanks!
> > - Harrison
>
