Event de duplication in flink with rabbitmq connector

banu priya Tue, 09 Jul 2024 01:12:41 -0700

Hi All,

I have a Flink job with a RMQ source, tumbling windows (fires for each 2s),
an aggregator, then a RMQ sink. Incremental RocksDB checkpointing is
enabled with an interval of 5 minutes.


I was trying to understand Flink failure recovery. My checkpoint X is
started, I have sent one event to my source. As windows are triggered every
2s, my sink is updated with the aggregated result. But when I checked the
RabbitMQ console, my source queue still had unacked messages. (It is
expected and it is as per design
https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/datastream/rabbitmq/#rabbitmq-source
).

Now I restarted my task manager, as restart happens within the same
checkpoint interval and checkpoint X has not yet completed. The message is
not acknowledged and is sent again. Duplicate processing of events happens.

How to avoid these duplicates?


Thanks

Banu

Event de duplication in flink with rabbitmq connector

Reply via email to