Thanks Niels for a great talk. You covered two of my pain areas, slim and
broken streams, since I am dealing with device data from on-prem data
centers. The first option of generating fabricated watermark events works
for me; however, as mentioned in your talk, how are you forwarding them to
the next stream (the next Kafka topic) after enrichment? Do you have a
solution for this?
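
To make the question concrete, what I mean by a fabricated watermark event
is something like this hypothetical envelope record (names are made up,
not from your talk) that travels through the topic next to real device
data:

// Illustrative only: an envelope that is either a real device event or a
// fabricated watermark marker injected for otherwise-idle partitions.
public class DeviceEnvelope {
    public String deviceId;     // null for fabricated watermark records
    public long eventTimeMs;    // event time, or the watermark timestamp
    public boolean isWatermark; // true => synthetic watermark marker

    public static DeviceEnvelope watermark(long timestampMs) {
        DeviceEnvelope e = new DeviceEnvelope();
        e.isWatermark = true;
        e.eventTimeMs = timestampMs;
        return e;
    }
}

The open question is how such markers survive the enrichment step and get
re-emitted into the next topic.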

-Hemant

On Thu, Jul 23, 2020 at 12:05 PM Niels Basjes <ni...@basjes.nl> wrote:

> Have a look at this presentation I gave a few weeks ago.
> https://youtu.be/bQmz7JOmE_4
>
> Niels Basjes
>
> On Wed, 22 Jul 2020, 08:51 bat man, <tintin0...@gmail.com> wrote:
>
>> Hi Team,
>>
>> Can someone share their experiences handling this?
>>
>> Thanks.
>>
>> On Tue, Jul 21, 2020 at 11:30 AM bat man <tintin0...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have a pipeline which consumes data from a Kafka source. Since the
>>> topic is partitioned by device_id, when a group of devices is down some
>>> partitions will not get a normal flow of data.
>>> I understand from the documentation here [1] that in Flink 1.11 one can
>>> declare the source idle -
>>> WatermarkStrategy
>>>     .<Tuple2<Long, String>>forBoundedOutOfOrderness(Duration.ofSeconds(20))
>>>     .withIdleness(Duration.ofMinutes(1));
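>>>
>>> For completeness, a minimal sketch (Flink 1.11 API; topic name, schema,
>>> and properties below are assumptions) of attaching that strategy to the
>>> Kafka source itself, so idleness is tracked per Kafka partition:
>>>
>>> FlinkKafkaConsumer<Tuple2<Long, String>> consumer =
>>>     new FlinkKafkaConsumer<>("device-events", schema, kafkaProps);
>>> consumer.assignTimestampsAndWatermarks(
>>>     WatermarkStrategy
>>>         .<Tuple2<Long, String>>forBoundedOutOfOrderness(Duration.ofSeconds(20))
>>>         .withIdleness(Duration.ofMinutes(1)));
>>> DataStream<Tuple2<Long, String>> events = env.addSource(consumer);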
>>>
>>> How can I handle this in 1.9? I am using AWS EMR, and EMR doesn't have
>>> any release with the latest Flink version yet.
>>>
>>> One way I can think of is to trigger watermark generation every 10
>>> minutes or so using periodic watermarks. However, this will not be
>>> foolproof; is there a better way to handle this more dynamically?
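>>>
>>> Roughly, I have in mind something like this hypothetical assigner (all
>>> names are mine and untested): it emits watermarks from event time as
>>> usual, but falls back to the wall clock once a partition has been quiet
>>> for a while. This assumes event time roughly tracks processing time,
>>> which breaks on replays.
>>>
>>> import org.apache.flink.api.java.tuple.Tuple2;
>>> import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
>>> import org.apache.flink.streaming.api.watermark.Watermark;
>>>
>>> public class IdleAwareAssigner
>>>         implements AssignerWithPeriodicWatermarks<Tuple2<Long, String>> {
>>>
>>>     private static final long MAX_OUT_OF_ORDERNESS = 20_000L; // 20 s
>>>     private static final long IDLE_TIMEOUT = 600_000L;        // 10 min
>>>
>>>     private long maxTimestamp = Long.MIN_VALUE + MAX_OUT_OF_ORDERNESS;
>>>     private long lastEventWallClock = System.currentTimeMillis();
>>>
>>>     @Override
>>>     public long extractTimestamp(Tuple2<Long, String> element, long previousTimestamp) {
>>>         lastEventWallClock = System.currentTimeMillis();
>>>         maxTimestamp = Math.max(maxTimestamp, element.f0);
>>>         return element.f0;
>>>     }
>>>
>>>     @Override
>>>     public Watermark getCurrentWatermark() {
>>>         long eventBased = maxTimestamp - MAX_OUT_OF_ORDERNESS;
>>>         if (System.currentTimeMillis() - lastEventWallClock > IDLE_TIMEOUT) {
>>>             // Partition looks idle: let the watermark follow the wall
>>>             // clock so downstream event-time operators can still progress.
>>>             return new Watermark(Math.max(eventBased,
>>>                     System.currentTimeMillis() - IDLE_TIMEOUT - MAX_OUT_OF_ORDERNESS));
>>>         }
>>>         return new Watermark(eventBased);
>>>     }
>>> }
>>>
>>> Attaching it on the FlinkKafkaConsumer itself (rather than on the
>>> stream) should make it run per Kafka partition; the emit interval comes
>>> from env.getConfig().setAutoWatermarkInterval(...).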
>>>
>>> [1] -
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/event_timestamps_watermarks.html#watermark-strategies-and-the-kafka-connector
>>>
>>> Thanks,
>>> Hemant
>>>
>>>
