+1 On Fri, Apr 21, 2017 at 10:53 AM Thomas Groh <tg...@google.com.invalid> wrote:
> A happy +1. This simplifies the code base, and if we find a compelling use, > it shouldn't be too bad to add it back in. > > On Fri, Apr 21, 2017 at 10:24 AM, Kenneth Knowles <k...@google.com.invalid> > wrote: > > > Hi all, > > > > I propose that we remove KeyedCombineFn before the first stable release. > > > > I don't think it adds enough value for the complexity it adds to e.g. > > CombineWithContext [1] and state [2, 3], and it doesn't seem to me that > > users really use it when we might expect. I am happy to be demonstrated > > wrong. > > > > It is very likely that you have never written [4, 5] or thought about > > KeyedCombineFn. So for context, here are excepts from signatures just to > > show the difference from CombineFn: > > > > CombineFn<InputT, AccumT, OutputT> { > > AccumT createAccumulator(); > > AccumT addInput(AccumT accum, InputT input); > > AccumT mergeAccumulators(Iterable<AccumT> accums); > > OutputT extractOutput(AccumT accum); > > } > > > > KeyedCombineFn<K, InputT, AccumT, OutputT> { > > AccumT createAccumulator(K key); > > AccumT addInput(K key, AccumT accum, InputT input); > > AccumT mergeAccumulators(K key, Iterable<AccumT> accums); > > OutputT extractOutput(K key, AccumT accum); > > } > > > > So what are the particular reasons for this, versus a CombineFn that has > > KVs as its input and accumulator types? > > > > - There are some performance improvements potentially from not passing > > keys around, based on the assumption they are always available. > > > > - There is also a spec difference because it only has to be associative > > and commutative per key, cannot be applied in a global combine, and > > addInput is automatically key preserving. > > > > But in fact, in all of my code crawling the class is almost never used > > (even over the course of its history at Google) and even the few uses I > > found were often mistakes where the key is totally ignored, probably > > because a user thinks "I am doing a keyed combine so I need a keyed > combine > > function". So the number of users actually affected is about zero. > > > > I would be curious if anyone has a compelling case for keeping > > KeyedCombineFn. > > > > Kenn > > > > [1] > > https://github.com/yafengguo/Apache-beam/blob/master/sdks/ > > java/core/src/main/java/org/apache/beam/sdk/transforms/ > > CombineWithContext.java > > [2] https://issues.apache.org/jira/browse/BEAM-1336 > > [3] https://github.com/apache/beam/pull/2627 > > [4] > > https://github.com/search?l=Java&q=KeyedCombineFn&ref= > > advsearch&type=Code&utf8=%E2%9C%93 > > [5] https://www.google.com/search?q=KeyedCombineFn > > >