+1

On Fri, Apr 21, 2017 at 10:53 AM Thomas Groh <tg...@google.com.invalid>
wrote:

> A happy +1. This simplifies the code base, and if we find a compelling use,
> it shouldn't be too bad to add it back in.
>
> On Fri, Apr 21, 2017 at 10:24 AM, Kenneth Knowles <k...@google.com.invalid>
> wrote:
>
> > Hi all,
> >
> > I propose that we remove KeyedCombineFn before the first stable release.
> >
> > I don't think it adds enough value for the complexity it adds to e.g.
> > CombineWithContext [1] and state [2, 3], and it doesn't seem to me that
> > users really use it when we might expect. I am happy to be demonstrated
> > wrong.
> >
> > It is very likely that you have never written [4, 5] or thought about
> > KeyedCombineFn. So for context, here are excepts from signatures just to
> > show the difference from CombineFn:
> >
> > CombineFn<InputT, AccumT, OutputT> {
> >   AccumT createAccumulator();
> >   AccumT addInput(AccumT accum, InputT input);
> >   AccumT mergeAccumulators(Iterable<AccumT> accums);
> >   OutputT extractOutput(AccumT accum);
> > }
> >
> > KeyedCombineFn<K, InputT, AccumT, OutputT> {
> >   AccumT createAccumulator(K key);
> >   AccumT addInput(K key, AccumT accum, InputT input);
> >   AccumT mergeAccumulators(K key, Iterable<AccumT> accums);
> >   OutputT extractOutput(K key, AccumT accum);
> > }
> >
> > So what are the particular reasons for this, versus a CombineFn that has
> > KVs as its input and accumulator types?
> >
> >  - There are some performance improvements potentially from not passing
> > keys around, based on the assumption they are always available.
> >
> >  - There is also a spec difference because it only has to be associative
> > and commutative per key, cannot be applied in a global combine, and
> > addInput is automatically key preserving.
> >
> > But in fact, in all of my code crawling the class is almost never used
> > (even over the course of its history at Google) and even the few uses I
> > found were often mistakes where the key is totally ignored, probably
> > because a user thinks "I am doing a keyed combine so I need a keyed
> combine
> > function". So the number of users actually affected is about zero.
> >
> > I would be curious if anyone has a compelling case for keeping
> > KeyedCombineFn.
> >
> > Kenn
> >
> > [1]
> > https://github.com/yafengguo/Apache-beam/blob/master/sdks/
> > java/core/src/main/java/org/apache/beam/sdk/transforms/
> > CombineWithContext.java
> > [2] https://issues.apache.org/jira/browse/BEAM-1336
> > [3] https://github.com/apache/beam/pull/2627
> > [4]
> > https://github.com/search?l=Java&q=KeyedCombineFn&ref=
> > advsearch&type=Code&utf8=%E2%9C%93
> > [5] https://www.google.com/search?q=KeyedCombineFn
> >
>

Reply via email to