[ 
https://issues.apache.org/jira/browse/FLINK-28719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578890#comment-17578890
 ] 

Chesnay Schepler commented on FLINK-28719:
------------------------------------------

> As far as I understand, op1 and op2 should have watermark 1 and 2 
> respectively, because those subtasks don't have any events in them (besides 1 
> and 2, of course, which create those watermarks). Then, why do they get the 
> maximum watermark of 7 after first step?

Watermarks are broadcasted to all downstream operators. So, if a source emits 
watermark 7, then all maps get it as an input.

> Also, why after op1 and op2, op4 and op5 get consumed? Is there any strategy 
> that dictates in which order to process subtasks?

Whichever arrives first at the downstream operator. You have 7 map subtasks all 
sending data to the window operators at the same time, and the order in which 
they arrive is not deterministic. So they _may_ arrive in a perfect round-robin 
pattern, or sequentially, or in any other pattern. The only guarantee is that 
the elements from a particular map subtask arrive in the same order that they 
were sent in.

> Also, why op3 to op7 have such watermarks: W2, W5, W6, W6,  W6, after first 
> step? I thought those subtasks should have watermark of Long.MinValue, 
> because there were no elements before?

The above 2 comments should answer it; all watermarks are sent to all map 
subtasks, and are consumed by the window operator in any order.

> Mapping a data source before window aggregation causes Flink to stop handle 
> late events correctly.
> --------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-28719
>                 URL: https://issues.apache.org/jira/browse/FLINK-28719
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.15.1
>            Reporter: Mykyta Mykhailenko
>            Priority: Major
>
> I have created a 
> [repository|https://github.com/mykytamykhailenko/flink-map-with-issue] where 
> I describe this issue in detail. 
> I have provided a few tests and source code so that you can reproduce the 
> issue on your own machine. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to