Thanks, Niels, for a great talk. You have covered two of my pain areas, slim and broken streams, since I am dealing with device data from on-prem data centers. The first option of generating fabricated watermark events works for me; however, as mentioned in your talk, how are you handling forwarding them to the next stream (the next Kafka topic) after enrichment? Have you got any solution for this?
-Hemant

On Thu, Jul 23, 2020 at 12:05 PM Niels Basjes <ni...@basjes.nl> wrote:

> Have a look at this presentation I gave a few weeks ago.
> https://youtu.be/bQmz7JOmE_4
>
> Niels Basjes
>
> On Wed, 22 Jul 2020, 08:51 bat man, <tintin0...@gmail.com> wrote:
>
>> Hi Team,
>>
>> Can someone share their experiences handling this?
>>
>> Thanks.
>>
>> On Tue, Jul 21, 2020 at 11:30 AM bat man <tintin0...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have a pipeline which consumes data from a Kafka source. Since the
>>> topic is partitioned by device_id, some partitions will not get a
>>> normal flow of data when a group of devices is down.
>>> I understand from the documentation here [1] that in Flink 1.11 one
>>> can declare the source idle:
>>>
>>> WatermarkStrategy.<Tuple2<Long, String>>forBoundedOutOfOrderness(
>>>     Duration.ofSeconds(20)).withIdleness(Duration.ofMinutes(1));
>>>
>>> How can I handle this in 1.9? I am using AWS EMR, and EMR doesn't
>>> have any release with the latest Flink version.
>>>
>>> One way I could think of is to trigger watermark generation every 10
>>> minutes or so using periodic watermarks. However, this will not be
>>> foolproof. Is there a better way to handle this more dynamically?
>>>
>>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/event_timestamps_watermarks.html#watermark-strategies-and-the-kafka-connector
>>>
>>> Thanks,
>>> Hemant
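For the Flink 1.9 case discussed above, here is a minimal sketch of the idle-detection logic one might put inside an AssignerWithPeriodicWatermarks: track the highest event timestamp and the wall-clock time of the last element, and once the source has been quiet longer than an idle timeout, start advancing the watermark with processing time so downstream operators are not held back. The class and method names here are hypothetical (the logic is shown standalone, without the Flink interfaces, so it can be read and tested in isolation), and the exact advancement policy is one possible choice, not the library's API.

```java
// Sketch (hypothetical names): the core of an idle-aware periodic watermark
// generator for Flink 1.9. In a real job, onEvent(...) would be driven from
// extractTimestamp(...) and currentWatermark(...) from getCurrentWatermark(),
// which Flink invokes at the configured auto-watermark interval.
class IdleAwareWatermarks {
    private final long maxOutOfOrdernessMs; // bound on event-time lateness, e.g. 20s as in the thread
    private final long idleTimeoutMs;       // how long with no events before we call the source idle, e.g. 1 min
    private long maxSeenTimestamp = Long.MIN_VALUE;
    private long lastEventWallClockMs;      // processing time when the last element arrived

    IdleAwareWatermarks(long maxOutOfOrdernessMs, long idleTimeoutMs, long nowMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastEventWallClockMs = nowMs;
    }

    // Called for every element; nowMs is the current processing time.
    void onEvent(long eventTimestampMs, long nowMs) {
        if (eventTimestampMs > maxSeenTimestamp) {
            maxSeenTimestamp = eventTimestampMs;
        }
        lastEventWallClockMs = nowMs;
    }

    // Called periodically; returns the watermark to emit.
    long currentWatermark(long nowMs) {
        long idleForMs = nowMs - lastEventWallClockMs;
        if (idleForMs > idleTimeoutMs) {
            // Source looks idle: fabricate progress by advancing the watermark
            // with processing time past the last real event.
            return maxSeenTimestamp + (idleForMs - idleTimeoutMs) - maxOutOfOrdernessMs;
        }
        // Normal case: bounded-out-of-orderness watermark.
        return maxSeenTimestamp - maxOutOfOrdernessMs;
    }
}
```

Note the trade-off: once the watermark is advanced with processing time, any late-arriving real events from the stalled devices may fall behind the watermark and be treated as late, so the idle timeout should be chosen generously.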