[ 
https://issues.apache.org/jira/browse/FLINK-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839971#comment-15839971
 ] 

ASF GitHub Bot commented on FLINK-5529:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3191#discussion_r98028880
  
    --- Diff: docs/dev/windows.md ---
    @@ -795,15 +821,17 @@ necessarily the ones that arrive first or last.
     
     ## Allowed Lateness
     
    -When working with *event-time* windowing it can happen that elements 
arrive late, *i.e.* the watermark that Flink uses to 
    +When working with *event-time* windowing, it can happen that elements 
arrive late, *i.e.* the watermark that Flink uses to 
     keep track of the progress of event-time is already past the end timestamp 
of a window to which an element belongs. See 
     [event time](./event_time.html) and especially [late 
elements](./event_time.html#late-elements) for a more thorough 
     discussion of how Flink deals with event time.
     
    -By default, late elements are dropped if their associated window was 
already evaluated. However, 
    +By default, late elements are dropped when the watermark is past the end 
of the window. However, 
     Flink allows to specify a maximum *allowed lateness* for window operators. 
Allowed lateness 
    -specifies by how much time elements can be late before they are dropped. 
Elements that arrive 
    -within the allowed lateness of a window are still added to the window and 
trigger an immediate evaluation of the window which might emit elements.
    +specifies by how much time elements can be late before they are dropped, 
and its default value is 0. 
    +Elements that arrive after the watermark is past the end of the window but 
before it passes the end of 
    +window plus the allowed lateness, are still added to the window. Depending 
on the trigger used, 
    --- End diff --
    
    "passes the end of window" -> "**passed** the end of **the** window"


> Improve / extends windowing documentation
> -----------------------------------------
>
>                 Key: FLINK-5529
>                 URL: https://issues.apache.org/jira/browse/FLINK-5529
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Documentation
>            Reporter: Stephan Ewen
>            Assignee: Kostas Kloudas
>             Fix For: 1.2.0, 1.3.0
>
>
> Suggested Outline:
> {code}
> Windows
> (0) Outline: The anatomy of a window operation
>   stream
>      [.keyBy(...)]         <-  keyed versus non-keyed windows
>       .window(...)         <-  required: "assigner"
>      [.trigger(...)]       <-  optional: "trigger" (else default trigger)
>      [.evictor(...)]       <-  optional: "evictor" (else no evictor)
>      [.allowedLateness()]  <-  optional, else zero
>       .reduce/fold/apply() <-  required: "function"
> (1) Types of windows
>   - tumble
>   - slide
>   - session
>   - global
> (2) Pre-defined windows
>    timeWindow() (tumble, slide)
>    countWindow() (tumble, slide)
>      - mention that count windows are inherently
>        resource leaky unless limited key space
> (3) Window Functions
>   - apply: most basic, iterates over elements in window
>   
>   - aggregating: reduce and fold, can be used with "apply()" which will get 
> one element
>   
>   - forward reference to state size section
> (4) Advanced Windows
>   - assigner
>     - simple
>     - merging
>   - trigger
>     - registering timers (processing time, event time)
>     - state in triggers
>   - life cycle of a window
>     - create
>     - state
>     - cleanup
>       - when is window contents purged
>       - when is state dropped
>       - when is metadata (like merging set) dropped
> (5) Late data
>   - picture
>   - fire vs fire_and_purge: late accumulates vs late resurrects (cf 
> discarding mode)
>   
> (6) Evictors
>   - TDB
>   
> (7) State size: How large will the state be?
> Basic rule: Each element has one copy per window it is assigned to
>   --> num windows * num elements in window
>   --> example: tumbline is one copy, sliding(n,m) is n/m copies
>   --> per key
> Pre-aggregation:
>   - if reduce or fold is set -> one element per window (rather than num 
> elements in window)
>   - evictor voids pre-aggregation from the perspective of state
> Special rules:
>   - fold cannot pre-aggregate on session windows (and other merging windows)
> (8) Non-keyed windows
>   - all elements through the same windows
>   - currently not parallel
>   - possible parallel in the future when having pre-aggregation functions
>   - inherently (by definition) produce a result stream with parallelism one
>   - state similar to one key of keyed windows
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to