+1

Regards
JB

On 04/21/2017 07:24 PM, Kenneth Knowles wrote:
Hi all,

I propose that we remove KeyedCombineFn before the first stable release.

I don't think it adds enough value for the complexity it adds to e.g.
CombineWithContext [1] and state [2, 3], and it doesn't seem to me that
users really use it when we might expect. I am happy to be demonstrated
wrong.

It is very likely that you have never written [4, 5] or thought about
KeyedCombineFn. So for context, here are excepts from signatures just to
show the difference from CombineFn:

CombineFn<InputT, AccumT, OutputT> {
  AccumT createAccumulator();
  AccumT addInput(AccumT accum, InputT input);
  AccumT mergeAccumulators(Iterable<AccumT> accums);
  OutputT extractOutput(AccumT accum);
}

KeyedCombineFn<K, InputT, AccumT, OutputT> {
  AccumT createAccumulator(K key);
  AccumT addInput(K key, AccumT accum, InputT input);
  AccumT mergeAccumulators(K key, Iterable<AccumT> accums);
  OutputT extractOutput(K key, AccumT accum);
}

So what are the particular reasons for this, versus a CombineFn that has
KVs as its input and accumulator types?

 - There are some performance improvements potentially from not passing
keys around, based on the assumption they are always available.

 - There is also a spec difference because it only has to be associative
and commutative per key, cannot be applied in a global combine, and
addInput is automatically key preserving.

But in fact, in all of my code crawling the class is almost never used
(even over the course of its history at Google) and even the few uses I
found were often mistakes where the key is totally ignored, probably
because a user thinks "I am doing a keyed combine so I need a keyed combine
function". So the number of users actually affected is about zero.

I would be curious if anyone has a compelling case for keeping
KeyedCombineFn.

Kenn

[1]
https://github.com/yafengguo/Apache-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineWithContext.java
[2] https://issues.apache.org/jira/browse/BEAM-1336
[3] https://github.com/apache/beam/pull/2627
[4]
https://github.com/search?l=Java&q=KeyedCombineFn&ref=advsearch&type=Code&utf8=%E2%9C%93
[5] https://www.google.com/search?q=KeyedCombineFn


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Reply via email to