Re: force the coder for a pardo

Romain Manni-Bucau Tue, 20 Feb 2018 08:44:49 -0800

+1

If nobody beats me to it, I can probably have a look next week - but happy
to let somebody else have fun too ;)



Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-20 16:12 GMT+01:00 Jean-Baptiste Onofré <j...@nanthrax.net>:

> Agree. It makes sense to me in order to be consistent between the
> PTransforms.
>
> Regards
> JB
> On 20 Feb 2018, at 16:08, Eugene Kirpichov <kirpic...@google.com> wrote:
>>
>> Something similar was discussed a while ago and led to the suggestion
>> in the PTransform Style Guide:
>> https://beam.apache.org/contribute/ptransform-style-guide/#setting-coders-on-output-collections
>>
>> This suggestion is currently not followed by ParDo, but your plan moves
>> in that direction, so +1 to that.
>> Since a ParDo may have multiple outputs, though, I'd suggest
>> organizing it using builder methods:
>> ParDo.of(new MyFn())
>>   .withOutputTags(...)
>>   .withCoder(...coder for main output...)
>>   .withCoder(tag1, coder1)
>>   .withCoder(tag2, coder2)
>>
>> This would bring ParDo in line with all other transforms that allow
>> specifying a coder.
>>
>> On Tue, Feb 20, 2018 at 3:41 AM Jean-Baptiste Onofré < j...@nanthrax.net>
>> wrote:
>>
>>> Got the point.
>>>
>>> No problem for me. It could be a new PTransform in core or in an extension.
>>>
>>> Regards
>>> JB
>>> On 20 Feb 2018, at 11:17, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>
>>>> Yep, the idea is to encapsulate the transform. Today, if you develop a
>>>> DoFn, you are forced to write a PTransform just to set the coder, which is
>>>> generally overkill. Being able to do it on the ParDo would increase
>>>> composability on the user side and reduce the boilerplate needed.
>>>>
>>>> public class MyFn extends DoFn<A, B> {
>>>>
>>>>   // ...impl
>>>>
>>>>   public static PTransform<PCollection<A>, PCollection<B>> of(...) {
>>>>     return ParDo.of(MyCoder.of(), new MyFn());
>>>>   }
>>>>
>>>> }
>>>>
>>>> Instead of also having to implement this in every DoFn library:
>>>>
>>>> @NoArgsConstructor(access = PROTECTED)
>>>> @AllArgsConstructor
>>>> class ParDoTransformCoderProvider<A, B> extends PTransform<PCollection<A>, PCollection<B>> {
>>>>
>>>>     private Coder<B> coder;
>>>>
>>>>     private DoFn<A, B> fn;
>>>>
>>>>     @Override
>>>>     public PCollection<B> expand(final PCollection<A> input) {
>>>>         return input.apply(ParDo.of(fn)).setCoder(coder);
>>>>     }
>>>> }
>>>>
>>>>
>>>> 2018-02-20 11:03 GMT+01:00 Jean-Baptiste Onofré <j...@nanthrax.net>:
>>>>
>>>>> Not on the PCollection? Only on the ParDo?
>>>>> On 20 Feb 2018, at 10:50, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> Any objection to allowing a coder to be passed along with the ParDo? The
>>>>>> idea is to avoid having to write your own transform just to configure the
>>>>>> coder when you start from a DoFn, and instead just do something like
>>>>>>
>>>>>> ParDo.of(new MyFn(), new MyCoder()), which can be plugged directly into
>>>>>> a pipeline.
>>>>>>
>>>>>> wdyt?
>>>>>>
>>>>>>
>>>>>
>>>>
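
As a rough illustration of the builder methods suggested in the thread (a withCoder(...) for the main output plus a withCoder(tag, coder) per tagged output), here is a toy model in plain Java. It assumes nothing about Beam's actual classes: ToyParDo, ToyCoder, the MAIN constant and the string tags are hypothetical stand-ins invented for this sketch. The only point is to show how each call could return a new, immutable transform that carries one more coder.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: ToyCoder and ToyParDo are hypothetical stand-ins,
// not Beam's Coder, TupleTag or ParDo classes.
public class WithCoderSketch {

    /** Hypothetical stand-in for a coder; it only carries a name. */
    static final class ToyCoder {
        final String name;

        ToyCoder(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for a multi-output ParDo being configured. */
    static final class ToyParDo {
        static final String MAIN = "main";

        private final Map<String, ToyCoder> coders;

        private ToyParDo(Map<String, ToyCoder> coders) {
            this.coders = Collections.unmodifiableMap(coders);
        }

        static ToyParDo create() {
            return new ToyParDo(new LinkedHashMap<>());
        }

        /** Sets the coder of the main output. */
        ToyParDo withCoder(ToyCoder coder) {
            return withCoder(MAIN, coder);
        }

        /** Sets the coder of one tagged output and returns a new instance. */
        ToyParDo withCoder(String tag, ToyCoder coder) {
            Map<String, ToyCoder> copy = new LinkedHashMap<>(coders);
            copy.put(tag, coder);
            return new ToyParDo(copy);
        }

        /** Returns the coder set for the tag, or null when none was set. */
        ToyCoder coderFor(String tag) {
            return coders.get(tag);
        }
    }

    public static void main(String[] args) {
        ToyParDo base = ToyParDo.create();
        ToyParDo configured = base
            .withCoder(new ToyCoder("mainCoder"))
            .withCoder("tag1", new ToyCoder("coder1"))
            .withCoder("tag2", new ToyCoder("coder2"));

        System.out.println(configured.coderFor("tag1").name); // prints "coder1"
        System.out.println(base.coderFor("tag1"));            // prints "null"
    }
}
```

Returning a fresh instance from every withCoder call keeps a partially configured transform safe to share and reuse, which is one plausible reason to prefer builder methods over extra ParDo.of(...) overloads.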
