Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Robert Bradshaw
You can literally return a Python tuple of outputs from a composite
transform as well. (Dicts with PCollections as values are also
supported, if you want things to be named rather than referenced by
index.)

On Fri, Oct 25, 2019 at 4:06 PM Ahmet Altay  wrote:
>
> Is DoOutputsTuple what you are looking for? [1] You can look at this expand 
> function using it [2].
>
> [1] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L204
> [2] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L1283
>
> On Fri, Oct 25, 2019 at 3:51 PM Luke Cwik  wrote:
>>
>> On further investigation, my example is about multiple inputs, not multiple 
>> outputs, so it seems I don't know the answer.
>>
>> The documentation online[1] doesn't seem to specify how to do this for 
>> composite transforms either. All the examples are of the single-output 
>> variety as well[2].
>>
>> 1: 
>> https://beam.apache.org/documentation/programming-guide/#composite-transforms
>> 2: 
>> https://github.com/apache/beam/blob/4ba731fe93f7f8385c771caf576745d14edf34b8/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
>>
>> On Fri, Oct 25, 2019 at 10:24 AM Luke Cwik  wrote:
>>>
>>> I believe PCollectionTuple should be unnecessary since Python has first 
>>> class support for tuples as shown in the example below[1]. Can we use 
>>> tuples to solve your issue?
>>>
>>> wordsStartingWithA = \
>>> p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])
>>>
>>> wordsStartingWithB = \
>>> p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])
>>>
>>> ((wordsStartingWithA, wordsStartingWithB)
>>> | beam.Flatten()
>>> | LogElements())
>>>
>>> 1: 
>>> https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29
>>>
>>> On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:

>>>> Talked to Daniel offline and it looks like the Python SDK is missing 
>>>> PCollection Tuples like the one Java has: 
>>>> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
>>>>
>>>> I'll go ahead and implement that for the Python SDK.
>>>>
>>>> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>>>>>
>>>>> Hey All,
>>>>>
>>>>> I'm trying to implement an expand override with multiple output 
>>>>> PCollections. The kicker is that I want to insert a new transform for 
>>>>> each output PCollection. How can I do this?
>>>>>
>>>>> Regards,
>>>>> Sam


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Ahmet Altay
Is DoOutputsTuple what you are looking for? [1] You can look at this expand
function using it [2].

[1]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L204
[2]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L1283

On Fri, Oct 25, 2019 at 3:51 PM Luke Cwik  wrote:

> On further investigation, my example is about multiple inputs, not multiple
> outputs, so it seems I don't know the answer.
>
> The documentation online[1] doesn't seem to specify how to do this for
> composite transforms either. All the examples are of the single-output
> variety as well[2].
>
> 1:
> https://beam.apache.org/documentation/programming-guide/#composite-transforms
> 2:
> https://github.com/apache/beam/blob/4ba731fe93f7f8385c771caf576745d14edf34b8/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
>
> On Fri, Oct 25, 2019 at 10:24 AM Luke Cwik  wrote:
>
>> I believe PCollectionTuple should be unnecessary since Python has first
>> class support for tuples as shown in the example below[1]. Can we use
>> tuples to solve your issue?
>>
>> wordsStartingWithA = \
>> p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])
>>
>> wordsStartingWithB = \
>> p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])
>>
>> ((wordsStartingWithA, wordsStartingWithB)
>> | beam.Flatten()
>> | LogElements())
>>
>> 1:
>> https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29
>>
>> On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:
>>
>>> Talked to Daniel offline and it looks like the Python SDK is missing
>>> PCollection Tuples like the one Java has:
>>> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
>>> .
>>>
>>> I'll go ahead and implement that for the Python SDK.
>>>
>>> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>>>
>>>> Hey All,
>>>>
>>>> I'm trying to implement an expand override with multiple output
>>>> PCollections. The kicker is that I want to insert a new transform for each
>>>> output PCollection. How can I do this?
>>>>
>>>> Regards,
>>>> Sam

>>>


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Luke Cwik
I believe PCollectionTuple should be unnecessary since Python has first
class support for tuples as shown in the example below[1]. Can we use
tuples to solve your issue?

wordsStartingWithA = \
    p | 'Words starting with A' >> beam.Create(['apple', 'ant', 'arrow'])

wordsStartingWithB = \
    p | 'Words starting with B' >> beam.Create(['ball', 'book', 'bow'])

((wordsStartingWithA, wordsStartingWithB)
 | beam.Flatten()
 | LogElements())

1:
https://github.com/apache/beam/blob/238659bce8043e6a64619a959ab44453dbe22dff/learning/katas/python/Core%20Transforms/Flatten/Flatten/task.py#L29

On Fri, Oct 25, 2019 at 10:11 AM Sam Rohde  wrote:

> Talked to Daniel offline and it looks like the Python SDK is missing
> PCollection Tuples like the one Java has:
> https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
> .
>
> I'll go ahead and implement that for the Python SDK.
>
> On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:
>
>> Hey All,
>>
>> I'm trying to implement an expand override with multiple output
>> PCollections. The kicker is that I want to insert a new transform for each
>> output PCollection. How can I do this?
>>
>> Regards,
>> Sam
>>
>


Re: Multiple Outputs from Expand in Python

2019-10-25 Thread Sam Rohde
I talked to Daniel offline, and it looks like the Python SDK is missing
PCollection Tuples like the one Java has:
https://github.com/rohdesamuel/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java

I'll go ahead and implement that for the Python SDK.

On Thu, Oct 24, 2019 at 5:20 PM Sam Rohde  wrote:

> Hey All,
>
> I'm trying to implement an expand override with multiple output
> PCollections. The kicker is that I want to insert a new transform for each
> output PCollection. How can I do this?
>
> Regards,
> Sam
>


Multiple Outputs from Expand in Python

2019-10-24 Thread Sam Rohde
Hey All,

I'm trying to implement an expand override with multiple output
PCollections. The kicker is that I want to insert a new transform for each
output PCollection. How can I do this?

Regards,
Sam