Hi:
I have a few questions about the side output late data.  
Here is the API
stream
       .keyBy(...)               <-  keyed versus non-keyed windows
       .window(...)              <-  required: "assigner"
      [.trigger(...)]            <-  optional: "trigger" (else default trigger)
      [.evictor(...)]            <-  optional: "evictor" (else no evictor)
      [.allowedLateness(...)]    <-  optional: "lateness" (else zero)
      [.sideOutputLateData(...)] <-  optional: "output tag" (else no side 
output for late data)
       .reduce/aggregate/fold/apply()      <-  required: "function"
      [.getSideOutput(...)]      <-  optional: "output tag"

Apache Flink 1.9 Documentation: Windows

| 
| 
|  | 
Apache Flink 1.9 Documentation: Windows


 |

 |

 |



Here is the documentation:
Late elements considerations

When specifying an allowed lateness greater than 0, the window along with its 
content is kept after the watermark passes the end of the window. In these 
cases, when a late but not dropped element arrives, it could trigger another 
firing for the window. These firings are called late firings, as they are 
triggered by late events and in contrast to the main firing which is the first 
firing of the window. In case of session windows, late firings can further lead 
to merging of windows, as they may “bridge” the gap between two pre-existing, 
unmerged windows.

Attention You should be aware that the elements emitted by a late firing should 
be treated as updated results of a previous computation, i.e., your data stream 
will contain multiple results for the same computation. Depending on your 
application, you need to take these duplicated results into account or 
deduplicate them.

Questions:
1. If we have allowed lateness to be greater than 0 (say 5), then if an event 
which arrives at window end + 3 (within allowed lateness),     (a) it is 
considered late and  included in the window function as a late firing ?      
(b) Are the late firings under the control of the trigger ?      (c) If there 
are may events like this - are there multiple window function invocations ?     
(d) Are these events (still within window end + allowed lateness) also emitted 
via the side output late data ?2. If an event arrives after the window end + 
allowed lateness -     (a) Is it excluded from the window function but still 
emitted from the side output late data ?      (b) And if it is emitted is there 
any attribute which indicates for which window it was a late event ?      (c) 
Is there any time limit while the late side output remains active for a 
particular window or all late events channeled to it ?
Thanks
Thanks
Mans




Reply via email to