Hi,

We have a single Flink job that processes data from multiple data sources. These 
data sources are not aligned in time and can lose connectivity intermittently, 
sometimes for days, so their data arrives late.

We attempted to use event time and watermarks with parallel streams, using 
keyBy on the data source.

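In the case of parallel streams, for certain operators, the event time clock 
across all the subtasks of the operator is the minimum of the watermarks among 
all its input streams. As a result, a single data source that is days behind 
holds back event time for all the other data sources.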

Reference: 
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/event_time.html#watermarks-in-parallel-streams
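
To illustrate the behaviour we are running into, here is a minimal plain-Java 
sketch. It is not Flink code: the class MinWatermarkClock, its methods, and the 
source names are made up for illustration. It only models the rule described 
above, where the operator's event time clock is the minimum of the latest 
watermarks seen on its inputs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, NOT a Flink class: tracks the latest watermark per input
// and derives the event time clock as the minimum across all inputs.
public class MinWatermarkClock {
    private final Map<String, Long> watermarks = new HashMap<>();

    public MinWatermarkClock(String... inputs) {
        for (String input : inputs) {
            // No watermark has been received on this input yet.
            watermarks.put(input, Long.MIN_VALUE);
        }
    }

    // Record a watermark for one input; watermarks never move backwards.
    public void onWatermark(String input, long timestamp) {
        watermarks.merge(input, timestamp, Math::max);
    }

    // The event time clock is held back by the slowest input.
    public long currentEventTime() {
        long min = Long.MAX_VALUE;
        for (long wm : watermarks.values()) {
            min = Math.min(min, wm);
        }
        return min;
    }

    public static void main(String[] args) {
        MinWatermarkClock clock = new MinWatermarkClock("sourceA", "sourceB");
        clock.onWatermark("sourceA", 1_000_000L); // sourceA is up to date
        clock.onWatermark("sourceB", 10L);        // sourceB was offline for days
        System.out.println(clock.currentEventTime()); // prints 10
    }
}
```

In this sketch, event time stays at 10 until sourceB catches up, no matter how 
far sourceA has advanced. That is the effect we see across our data sources.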

While this seems to be a fundamental concept of Flink, are there any plans to 
support an event time clock per operator per subtask for such operators?

This is forcing us to give up on watermarks and fall back on processing time 
semantics or, in the worst case, to run a separate instance of the same Flink 
job for each data source from which we collect data through Kafka.

Thanks,
Sush
