Thanks David for your detailed answers.   Mans
    On Wednesday, December 11, 2019, 08:12:51 AM EST, David Anderson 
<da...@ververica.com> wrote:  
 
 
If we have allowed lateness to be greater than 0 (say 5), then if an event 
which arrives at window end + 3 (within allowed lateness), 

    (a) it is considered late and included in the window function as a late 
firing ?
An event with a timestamp that falls within the window's boundaries that 
arrives when the current watermark is at window end + 3 will be included as a 
late event that has arrived within the allowed lateness.

Actually, I'm not sure I got this right -- on this point I recommend some 
experimentation, or careful reading of the code.
On Wed, Dec 11, 2019 at 2:08 PM David Anderson <da...@ververica.com> wrote:

I'll attempt to answer your questions.

If we have allowed lateness to be greater than 0 (say 5), then if an event 
which arrives at window end + 3 (within allowed lateness), 
    (a) it is considered late and included in the window function as a late 
firing ?


An event with a timestamp that falls within the window's boundaries that 
arrives when the current watermark is at window end + 3 will be included as a 
late event that has arrived within the allowed lateness. 
    (b) Are the late firings under the control of the trigger ? 


Yes, the trigger is involved in all firings, late or not. 
    (c) If there are may events like this - are there multiple window function 
invocations ?

With the default event time trigger, each late event causes a late firing. You 
could use a custom trigger to implement other behaviors. 
    (d) Are these events (still within window end + allowed lateness) also 
emitted via the side output late data ?


No. The side output for late events is only used to collect events that fall 
outside the allowed lateness. 
2. If an event arrives after the window end + allowed lateness - 
    (a) Is it excluded from the window function but still emitted from the side 
output late data ?  


Yes. 
    (b) And if it is emitted is there any attribute which indicates for which 
window it was a late event ?  


No, the event is emitted without any additional information.

    (c) Is there any time limit while the late side output remains active for a 
particular window or all late events channeled to it ?

There is no time limit; the late side output remains operative indefinitely. 
Hope that helps,David
On Wed, Dec 11, 2019 at 1:40 PM M Singh <mans2si...@yahoo.com> wrote:

 Thanks Timo for your answer.  I will try the prototype but was wondering if I 
can find some theoretical documentation to give me a sound understanding.
Mans
    On Wednesday, December 11, 2019, 05:44:15 AM EST, Timo Walther 
<twal...@apache.org> wrote:  
 
 Little mistake: The key must be any constant instead of `e`.


On 11.12.19 11:42, Timo Walther wrote:
> Hi Mans,
> 
> I would recommend to create a little prototype to answer most of your 
> questions in action.
> 
> You can simple do:
> 
> stream = env.fromElements(1L, 2L, 3L, 4L)
>     .assignTimestampsAndWatermarks(
>         new AssignerWithPunctuatedWatermarks{
>             extractTimestamp(e) = e,
>             checkAndGetNextWatermark(e, ts) = new Watermark(e)
>     })
> 
> stream.keyBy(e -> e).window(...).print()
> env.execute()
> 
> This allows to quickly create a stream of event time for testing the 
> semantics.
> 
> I hope this helps. Otherwise of course we can help you in finding the 
> answers to the remaining questions.
> 
> Regards,
> Timo
> 
> 
> 
> On 10.12.19 20:32, M Singh wrote:
>> Hi:
>>
>> I have a few questions about the side output late data.
>>
>> Here is the API
>>
>> |stream .keyBy(...) <- keyed versus non-keyed windows .window(...) <- 
>> required: "assigner" [.trigger(...)] <- optional: "trigger" (else 
>> default trigger) [.evictor(...)] <- optional: "evictor" (else no 
>> evictor) [.allowedLateness(...)] <- optional: "lateness" (else zero) 
>> [.sideOutputLateData(...)] <- optional: "output tag" (else no side 
>> output for late data) .reduce/aggregate/fold/apply() <- required: 
>> "function" [.getSideOutput(...)] <- optional: "output tag"|
>>
>>
>>
>> Apache Flink 1.9 Documentation: Windows 
>> <https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#late-elements-considerations>
>>  
>>
>>
>>
>>
>>
>>     Apache Flink 1.9 Documentation: Windows
>>
>> <https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#late-elements-considerations>
>>  
>>
>>
>>
>> Here is the documentation:
>>
>>
>>       Late elements
>>      
>> considerations<https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#late-elements-considerations>
>>  
>>
>>
>> When specifying an allowed lateness greater than 0, the window along 
>> with its content is kept after the watermark passes the end of the 
>> window. In these cases, when a late but not dropped element arrives, 
>> it could trigger another firing for the window. These firings are 
>> called |late firings|, as they are triggered by late events and in 
>> contrast to the |main firing| which is the first firing of the window. 
>> In case of session windows, late firings can further lead to merging 
>> of windows, as they may “bridge” the gap between two pre-existing, 
>> unmerged windows.
>>
>> Attention You should be aware that the elements emitted by a late 
>> firing should be treated as updated results of a previous computation, 
>> i.e., your data stream will contain multiple results for the same 
>> computation. Depending on your application, you need to take these 
>> duplicated results into account or deduplicate them.
>>
>>
>> Questions:
>>
>> 1. If we have allowed lateness to be greater than 0 (say 5), then if 
>> an event which arrives at window end + 3 (within allowed lateness),
>>      (a) it is considered late and  included in the window function as 
>> a late firing ?
>>      (b) Are the late firings under the control of the trigger ?
>>      (c) If there are may events like this - are there multiple window 
>> function invocations ?
>>      (d) Are these events (still within window end + allowed lateness) 
>> also emitted via the side output late data ?
>> 2. If an event arrives after the window end + allowed lateness -
>>      (a) Is it excluded from the window function but still emitted 
>> from the side output late data ?
>>      (b) And if it is emitted is there any attribute which indicates 
>> for which window it was a late event ?
>>      (c) Is there any time limit while the late side output remains 
>> active for a particular window or all late events channeled to it ?
>>
>> Thanks
>>
>> Thanks
>>
>> Mans
>>
>>
>>
>>
>>

  

  

Reply via email to