Hello,

I have a pipeline which consumes data from a Kafka source. Since, the
partitions are partitioned by device_id in case a group of devices is down
some partitions will not get normal flow of data.
I understand from documentation here[1] in flink 1.11 one can declare the
source idle -
WatermarkStrategy.<Tuple2<Long, String>>forBoundedOutOfOrderness(Duration.
ofSeconds(20)).withIdleness(Duration.ofMinutes(1));

How can I handle this in 1.9, since I am using aws emr and emr doesn't have
any release with the latest flink version.

One way I could think of is to trigger watermark generation every 10
minutes or so using Periodic watermarks. However, this will not be full
proof, are there any better way to handle this more dynamically.

[1] -
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/event_timestamps_watermarks.html#watermark-strategies-and-the-kafka-connector

Thanks,
Hemant

Reply via email to