Re: Multiple Outputs from Expand in Python

2019-10-28 Thread Sam Rohde
Ahmet:
That's close to what I want; however, DoOutputsTuple doesn't allow
setting PCollections manually.

Robert:
Great! I didn't see that documented anywhere. I'll try this out.

I will modify the pipeline replacement method to enable multiple outputs
too. Are there any "gotchas" when modifying this code?

On Fri, Oct 25, 2019 at 4:16 PM Robert Bradshaw  wrote:

> You can literally return a Python tuple of outputs from a composite
> transform as well. (Dicts with PCollections as values are also
> supported, if you want things to be named rather than referenced by
> index.)
>
> On Fri, Oct 25, 2019 at 4:06 PM Ahmet Altay  wrote:
> >
> > Is DoOutputsTuple what you are looking for? [1] You can look at this
> expand function using it [2].
> >
> > [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L204
> > [2]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L1283
> >
> > On Fri, Oct 25, 2019 at 3:51 PM Luke Cwik  wrote:
> >>
> >> On further investigation, my example is about multiple inputs and not
> >> multiple outputs, so it seems I don't know.
> >>
> >> The documentation online[1] doesn't seem to specify how to do this for
> >> composite transforms either. All the examples are of the single-output
> >> variety as well[2].
> >>
> >> 1:
> https://beam.apache.org/documentation/programming-guide/#composite-transforms
> >> 2:
> https://github.com/apache/beam/blob/4ba731fe93f7f8385c771caf576745d14edf34b8/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
> >>
> >> On Fri, Oct 25, 2019 at 10:24 AM Luke Cwik  wrote:
> >>>
> >>> I believe PCollectionTuple should be unnecessary since Python has
> first class support for tuples as shown in the example below[1]. Can we use
> tuples to solve your issue?
> >>>
> >>> wordsStartingWithA = \
> >>>     p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])
> >>>
> >>> wordsStartingWithB = \
> >>>     p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])
> >>>
> >>> ((wordsStartingWithA, wordsStartingWithB)
> >>>  | beam.Flatten()
> >>>  | LogElements())
> >>>
> >>> 1:
> https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29
> >>>
> >>> On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:
> >>>>
> >>>> Talked to Daniel offline and it looks like the Python SDK is missing
> >>>> PCollection Tuples like the one Java has:
> >>>> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
> >>>> .
> >>>>
> >>>> I'll go ahead and implement that for the Python SDK.
> >>>>
> >>>> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
> >>>>>
> >>>>> Hey All,
> >>>>>
> >>>>> I'm trying to implement an expand override with multiple output
> >>>>> PCollections. The kicker is that I want to insert a new transform for each
> >>>>> output PCollection. How can I do this?
> >>>>>
> >>>>> Regards,
> >>>>> Sam
>


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Robert Bradshaw
You can literally return a Python tuple of outputs from a composite
transform as well. (Dicts with PCollections as values are also
supported, if you want things to be named rather than referenced by
index.)

On Fri, Oct 25, 2019 at 4:06 PM Ahmet Altay  wrote:
>
> Is DoOutputsTuple what you are looking for? [1] You can look at this expand 
> function using it [2].
>
> [1] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L204
> [2] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L1283
>
> On Fri, Oct 25, 2019 at 3:51 PM Luke Cwik  wrote:
>>
>> On further investigation, my example is about multiple inputs and not
>> multiple outputs, so it seems I don't know.
>>
>> The documentation online[1] doesn't seem to specify how to do this for
>> composite transforms either. All the examples are of the single-output
>> variety as well[2].
>>
>> 1: 
>> https://beam.apache.org/documentation/programming-guide/#composite-transforms
>> 2: 
>> https://github.com/apache/beam/blob/4ba731fe93f7f8385c771caf576745d14edf34b8/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
>>
>> On Fri, Oct 25, 2019 at 10:24 AM Luke Cwik  wrote:
>>>
>>> I believe PCollectionTuple should be unnecessary since Python has first 
>>> class support for tuples as shown in the example below[1]. Can we use 
>>> tuples to solve your issue?
>>>
>>> wordsStartingWithA = \
>>>     p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])
>>>
>>> wordsStartingWithB = \
>>>     p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])
>>>
>>> ((wordsStartingWithA, wordsStartingWithB)
>>>  | beam.Flatten()
>>>  | LogElements())
>>>
>>> 1: 
>>> https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29
>>>
>>> On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:

>>>> Talked to Daniel offline and it looks like the Python SDK is missing
>>>> PCollection Tuples like the one Java has:
>>>> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
>>>>
>>>> I'll go ahead and implement that for the Python SDK.
>>>>
>>>> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>>>>>
>>>>> Hey All,
>>>>>
>>>>> I'm trying to implement an expand override with multiple output
>>>>> PCollections. The kicker is that I want to insert a new transform for
>>>>> each output PCollection. How can I do this?
>>>>>
>>>>> Regards,
>>>>> Sam


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Ahmet Altay
Is DoOutputsTuple what you are looking for? [1] You can look at this expand
function using it [2].

[1]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L204
[2]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L1283

On Fri, Oct 25, 2019 at 3:51 PM Luke Cwik  wrote:

> On further investigation, my example is about multiple inputs and not
> multiple outputs, so it seems I don't know.
>
> The documentation online[1] doesn't seem to specify how to do this for
> composite transforms either. All the examples are of the single-output
> variety as well[2].
>
> 1:
> https://beam.apache.org/documentation/programming-guide/#composite-transforms
> 2:
> https://github.com/apache/beam/blob/4ba731fe93f7f8385c771caf576745d14edf34b8/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
>
> On Fri, Oct 25, 2019 at 10:24 AM Luke Cwik  wrote:
>
>> I believe PCollectionTuple should be unnecessary since Python has first
>> class support for tuples as shown in the example below[1]. Can we use
>> tuples to solve your issue?
>>
>> wordsStartingWithA = \
>>     p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])
>>
>> wordsStartingWithB = \
>>     p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])
>>
>> ((wordsStartingWithA, wordsStartingWithB)
>>  | beam.Flatten()
>>  | LogElements())
>>
>> 1:
>> https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29
>>
>> On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:
>>
>>> Talked to Daniel offline and it looks like the Python SDK is missing
>>> PCollection Tuples like the one Java has:
>>> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
>>> .
>>>
>>> I'll go ahead and implement that for the Python SDK.
>>>
>>> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>>>
>>>> Hey All,
>>>>
>>>> I'm trying to implement an expand override with multiple output
>>>> PCollections. The kicker is that I want to insert a new transform for each
>>>> output PCollection. How can I do this?
>>>>
>>>> Regards,
>>>> Sam
>>>>
>>>


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Luke Cwik
On further investigation, my example is about multiple inputs and not
multiple outputs, so it seems I don't know.

The documentation online[1] doesn't seem to specify how to do this for
composite transforms either. All the examples are of the single-output
variety as well[2].

1:
https://beam.apache.org/documentation/programming-guide/#composite-transforms
2:
https://github.com/apache/beam/blob/4ba731fe93f7f8385c771caf576745d14edf34b8/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py

On Fri, Oct 25, 2019 at 10:24 AM Luke Cwik  wrote:

> I believe PCollectionTuple should be unnecessary since Python has first
> class support for tuples as shown in the example below[1]. Can we use
> tuples to solve your issue?
>
> wordsStartingWithA = \
>     p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])
>
> wordsStartingWithB = \
>     p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])
>
> ((wordsStartingWithA, wordsStartingWithB)
>  | beam.Flatten()
>  | LogElements())
>
> 1:
> https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29
>
> On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:
>
>> Talked to Daniel offline and it looks like the Python SDK is missing
>> PCollection Tuples like the one Java has:
>> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
>> .
>>
>> I'll go ahead and implement that for the Python SDK.
>>
>> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>>
>>> Hey All,
>>>
>>> I'm trying to implement an expand override with multiple output
>>> PCollections. The kicker is that I want to insert a new transform for each
>>> output PCollection. How can I do this?
>>>
>>> Regards,
>>> Sam
>>>
>>


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Luke Cwik
I believe PCollectionTuple should be unnecessary since Python has first
class support for tuples as shown in the example below[1]. Can we use
tuples to solve your issue?

wordsStartingWithA = \
    p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])

wordsStartingWithB = \
    p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])

((wordsStartingWithA, wordsStartingWithB)
 | beam.Flatten()
 | LogElements())

1:
https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29

On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:

> Talked to Daniel offline and it looks like the Python SDK is missing
> PCollection Tuples like the one Java has:
> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
> .
>
> I'll go ahead and implement that for the Python SDK.
>
> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>
>> Hey All,
>>
>> I'm trying to implement an expand override with multiple output
>> PCollections. The kicker is that I want to insert a new transform for each
>> output PCollection. How can I do this?
>>
>> Regards,
>> Sam
>>
>


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Sam Rohde
Talked to Daniel offline and it looks like the Python SDK is missing
PCollection Tuples like the one Java has:
https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
.

I'll go ahead and implement that for the Python SDK.

On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:

> Hey All,
>
> I'm trying to implement an expand override with multiple output
> PCollections. The kicker is that I want to insert a new transform for each
> output PCollection. How can I do this?
>
> Regards,
> Sam
>


Multiple Outputs from Expand in Python

2019-10-24 Thread Sam Rohde
Hey All,

I'm trying to implement an expand override with multiple output
PCollections. The kicker is that I want to insert a new transform for each
output PCollection. How can I do this?

Regards,
Sam