[ https://issues.apache.org/jira/browse/FLINK-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839866#comment-15839866 ]
ASF GitHub Bot commented on FLINK-5529: --------------------------------------- Github user kl0u commented on the issue: https://github.com/apache/flink/pull/3191 @aljoscha thanks for the review. I integrated the comments. I also include @fhueske as he also contributed to this. > Improve / extends windowing documentation > ----------------------------------------- > > Key: FLINK-5529 > URL: https://issues.apache.org/jira/browse/FLINK-5529 > Project: Flink > Issue Type: Sub-task > Components: Documentation > Reporter: Stephan Ewen > Assignee: Kostas Kloudas > Fix For: 1.2.0, 1.3.0 > > > Suggested Outline: > {code} > Windows > (0) Outline: The anatomy of a window operation > stream > [.keyBy(...)] <- keyed versus non-keyed windows > .window(...) <- required: "assigner" > [.trigger(...)] <- optional: "trigger" (else default trigger) > [.evictor(...)] <- optional: "evictor" (else no evictor) > [.allowedLateness()] <- optional, else zero > .reduce/fold/apply() <- required: "function" > (1) Types of windows > - tumble > - slide > - session > - global > (2) Pre-defined windows > timeWindow() (tumble, slide) > countWindow() (tumble, slide) > - mention that count windows are inherently > resource leaky unless limited key space > (3) Window Functions > - apply: most basic, iterates over elements in window > > - aggregating: reduce and fold, can be used with "apply()" which will get > one element > > - forward reference to state size section > (4) Advanced Windows > - assigner > - simple > - merging > - trigger > - registering timers (processing time, event time) > - state in triggers > - life cycle of a window > - create > - state > - cleanup > - when is window contents purged > - when is state dropped > - when is metadata (like merging set) dropped > (5) Late data > - picture > - fire vs fire_and_purge: late accumulates vs late resurrects (cf > discarding mode) > > (6) Evictors > - TDB > > (7) State size: How large will the state be? > Basic rule: Each element has one copy per window it is assigned to > --> num windows * num elements in window > --> example: tumbline is one copy, sliding(n,m) is n/m copies > --> per key > Pre-aggregation: > - if reduce or fold is set -> one element per window (rather than num > elements in window) > - evictor voids pre-aggregation from the perspective of state > Special rules: > - fold cannot pre-aggregate on session windows (and other merging windows) > (8) Non-keyed windows > - all elements through the same windows > - currently not parallel > - possible parallel in the future when having pre-aggregation functions > - inherently (by definition) produce a result stream with parallelism one > - state similar to one key of keyed windows > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)