+1 Regards JB
On 04/21/2017 07:24 PM, Kenneth Knowles wrote:
Hi all, I propose that we remove KeyedCombineFn before the first stable release. I don't think it adds enough value for the complexity it adds to e.g. CombineWithContext [1] and state [2, 3], and it doesn't seem to me that users really use it when we might expect. I am happy to be demonstrated wrong. It is very likely that you have never written [4, 5] or thought about KeyedCombineFn. So for context, here are excepts from signatures just to show the difference from CombineFn: CombineFn<InputT, AccumT, OutputT> { AccumT createAccumulator(); AccumT addInput(AccumT accum, InputT input); AccumT mergeAccumulators(Iterable<AccumT> accums); OutputT extractOutput(AccumT accum); } KeyedCombineFn<K, InputT, AccumT, OutputT> { AccumT createAccumulator(K key); AccumT addInput(K key, AccumT accum, InputT input); AccumT mergeAccumulators(K key, Iterable<AccumT> accums); OutputT extractOutput(K key, AccumT accum); } So what are the particular reasons for this, versus a CombineFn that has KVs as its input and accumulator types? - There are some performance improvements potentially from not passing keys around, based on the assumption they are always available. - There is also a spec difference because it only has to be associative and commutative per key, cannot be applied in a global combine, and addInput is automatically key preserving. But in fact, in all of my code crawling the class is almost never used (even over the course of its history at Google) and even the few uses I found were often mistakes where the key is totally ignored, probably because a user thinks "I am doing a keyed combine so I need a keyed combine function". So the number of users actually affected is about zero. I would be curious if anyone has a compelling case for keeping KeyedCombineFn. Kenn [1] https://github.com/yafengguo/Apache-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineWithContext.java [2] https://issues.apache.org/jira/browse/BEAM-1336 [3] https://github.com/apache/beam/pull/2627 [4] https://github.com/search?l=Java&q=KeyedCombineFn&ref=advsearch&type=Code&utf8=%E2%9C%93 [5] https://www.google.com/search?q=KeyedCombineFn
-- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com