Hi Ken, You are right. The merge() method combines partial aggregates, similar to a combinable reducer.
The only situation when merge() is called in a DataStream job (that I am aware of) is when session windows get merged. For example when you define a session window with 30 minute gap and you receive the following records R1, 12:00:00 R2, 12:05:00 R3, 12:40:00 R4, 12:20:00 In this case, Flink R1 will create a new window W1, R2 will be assigned to W1, R3 creates a new window W2, and R4 connects and merges W1 and W2. Best, Fabian 2018-05-05 0:46 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>: > I’m trying to figure out when/why the AggregateFunction.merge() > <https://ci.apache.org/projects/flink/flink-docs-release-1.4/api/java/org/apache/flink/api/common/functions/AggregateFunction.html#merge-ACC-ACC-> > method is called in a streaming job, to ensure I’ve implemented it > properly. > > The documentation for AggregateFunction says "Merging intermediate > aggregates (partial aggregates) means merging the accumulators.” > > But that sounds more like a combiner in batch processing, not streaming. > > From the code, it seems like this could be called if a > MergingWindowAssigner is used, right? > > And is there any other situation in streaming where merge() could be > called? > > Thanks, > > — Ken > > -------------------------------------------- > http://about.me/kkrugler > +1 530-210-6378 > >