Hi,
I think both fold and reduce fail to capture all the power of (what we call) 
combine. Reduce requires a function of type (T, T) -> T, so the output type 
must be the same as the input type. Fold takes a function (T, A) -> A, where 
T is the input type and A is the accumulation type, so the output type can 
differ from the input type. However, fold gives us no way of merging two 
accumulators, so the operation is not distributive, i.e. we cannot apply it 
hierarchically to partial results.
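
To make the type difference concrete, here is a minimal sketch in plain 
Python (not Beam code; the word list and variable names are made up for 
illustration). Note that Python's functools.reduce passes the accumulator 
first, so the fold function below has the shape (A, T) -> A.

```python
from functools import reduce

words = ["fold", "reduce", "combine"]

# reduce: (T, T) -> T. Input and output are both str.
longest = reduce(lambda a, b: a if len(a) >= len(b) else b, words)

# fold: an initial accumulator plus a step function.
# Input type is str, accumulator/output type is int.
total_len = reduce(lambda acc, w: acc + len(w), words, 0)

print(longest, total_len)
```

The fold step only knows how to add one more input to an accumulator. 
Nothing in it says how two partial results would be put together.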

Combine is the generalisation of this: we have three types, T (input), A 
(accumulator) and O (output), and we additionally require a function that can 
merge accumulators. This makes the operation distributive, meaning we can 
execute it efficiently on partial results, and the output type can still be 
different from the input type.
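
As a rough sketch of the idea, assuming a mean over numbers as the example: 
the class below is plain Python, not the Beam API. Its method names mirror 
the ones on Beam's CombineFn, but merge_accumulators here takes two 
accumulators to keep the example short, and MeanCombiner, partial and the 
chunking are hypothetical.

```python
from functools import reduce


class MeanCombiner:
    """T = number (input), A = (sum, count) (accumulator), O = float (output)."""

    def create_accumulator(self):
        return (0.0, 0)

    def add_input(self, acc, x):
        # Fold one input into an accumulator: (A, T) -> A
        total, count = acc
        return (total + x, count + 1)

    def merge_accumulators(self, a, b):
        # The extra piece that fold lacks: (A, A) -> A
        return (a[0] + b[0], a[1] + b[1])

    def extract_output(self, acc):
        # A -> O
        total, count = acc
        return total / count if count else float("nan")


def partial(combiner, chunk):
    # Fold a single chunk, as one worker would do on its share of the data.
    return reduce(combiner.add_input, chunk, combiner.create_accumulator())


c = MeanCombiner()
chunks = [[1, 2, 3], [4, 5], [6]]  # stand-in for data spread over workers
partials = [partial(c, chunk) for chunk in chunks]
merged = reduce(c.merge_accumulators, partials, c.create_accumulator())
mean = c.extract_output(merged)
print(mean)
```

Each chunk is folded on its own and only the small (sum, count) pairs are 
merged, which is what makes hierarchical execution possible.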

Quick FYI: in Flink the CombineFn is called AggregateFunction and 
CombiningState is AggregatingState.

Best,
Aljoscha
> On 18. Apr 2017, at 04:29, Wesley Tanaka <wtan...@yahoo.com.INVALID> wrote:
> 
> As I start to understand Combine.Globally, it seems that it is, in spirit, 
> Beam's implementation of the "fold" higher-order function
> https://en.wikipedia.org/wiki/Fold_(higher-order_function)#Folds_in_various_languages
> 
> Was there a reason the word "combine" was picked instead of either "fold" or 
> "reduce"?  From the wikipedia list above, it seems as though "fold" and 
> "reduce" are in much more common usage, so either of those might be easier 
> for newcomers to understand.
> ---
> Wesley Tanaka
> http://wtanaka.com/
