Hello,

I have a use case with event-time processing that ends up with a DAG roughly 
like this:

source -> filter -> keyBy -> watermark -> window -> process -> keyBy_2 -> 
connect (KeyedCoProcessFunction) -> sink
           |                                                               /
      (side output) -> keyBy -> watermark --------------------------------/


(In case the text gets mangled in the e-mail, the side output comes from the 
filter and joins back with the connect operation)

The filter takes all data and its main output is all _valid_ data with state 
"open"; the side output is all _valid_ data regardless of state.

The goal of the KeyedCoProcessFunction is to check the results of the (sliding) 
window. The window only receives open states, but KeyedCoProcessFunction 
receives all valid data and should discard results from the main stream if 
states changed from "open" to something else before the window was evaluated.

I would have expected all data from the side output to be processed roughly 
immediately in KeyedCoProcessFunction's processElement2 because there's no 
windowing in the side stream, but I'm not sure if this actually happens, maybe 
the side stream (or both streams) buffers some data before passing it to the 
connected stream? If yes, is there any way I could tune this?

Regards,
Alexis.

Reply via email to