Great, this is https://issues.apache.org/jira/browse/BEAM-2049 via
https://github.com/apache/beam/pull/2636.

On Fri, Apr 21, 2017 at 11:40 PM, Pei HE <p...@apache.org> wrote:

> +1
>
> On Sat, Apr 22, 2017 at 12:16 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > +1
> >
> > Regards
> > JB
> >
> >
> > On 04/21/2017 07:24 PM, Kenneth Knowles wrote:
> >
> >> Hi all,
> >>
> >> I propose that we remove KeyedCombineFn before the first stable release.
> >>
> >> I don't think it adds enough value for the complexity it adds to e.g.
> >> CombineWithContext [1] and state [2, 3], and it doesn't seem to me that
> >> users really use it when we might expect. I am happy to be demonstrated
> >> wrong.
> >>
> >> It is very likely that you have never written [4, 5] or thought about
> >> KeyedCombineFn. So for context, here are excepts from signatures just to
> >> show the difference from CombineFn:
> >>
> >> CombineFn<InputT, AccumT, OutputT> {
> >>   AccumT createAccumulator();
> >>   AccumT addInput(AccumT accum, InputT input);
> >>   AccumT mergeAccumulators(Iterable<AccumT> accums);
> >>   OutputT extractOutput(AccumT accum);
> >> }
> >>
> >> KeyedCombineFn<K, InputT, AccumT, OutputT> {
> >>   AccumT createAccumulator(K key);
> >>   AccumT addInput(K key, AccumT accum, InputT input);
> >>   AccumT mergeAccumulators(K key, Iterable<AccumT> accums);
> >>   OutputT extractOutput(K key, AccumT accum);
> >> }
> >>
> >> So what are the particular reasons for this, versus a CombineFn that has
> >> KVs as its input and accumulator types?
> >>
> >>  - There are some performance improvements potentially from not passing
> >> keys around, based on the assumption they are always available.
> >>
> >>  - There is also a spec difference because it only has to be associative
> >> and commutative per key, cannot be applied in a global combine, and
> >> addInput is automatically key preserving.
> >>
> >> But in fact, in all of my code crawling the class is almost never used
> >> (even over the course of its history at Google) and even the few uses I
> >> found were often mistakes where the key is totally ignored, probably
> >> because a user thinks "I am doing a keyed combine so I need a keyed
> >> combine
> >> function". So the number of users actually affected is about zero.
> >>
> >> I would be curious if anyone has a compelling case for keeping
> >> KeyedCombineFn.
> >>
> >> Kenn
> >>
> >> [1]
> >> https://github.com/yafengguo/Apache-beam/blob/master/sdks/ja
> >> va/core/src/main/java/org/apache/beam/sdk/transforms/Combine
> >> WithContext.java
> >> [2] https://issues.apache.org/jira/browse/BEAM-1336
> >> [3] https://github.com/apache/beam/pull/2627
> >> [4]
> >> https://github.com/search?l=Java&q=KeyedCombineFn&ref=advsea
> >> rch&type=Code&utf8=%E2%9C%93
> >> [5] https://www.google.com/search?q=KeyedCombineFn
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>

Reply via email to