Hi all, I have a Flink job with an RMQ source, tumbling windows (firing every 2s), an aggregator, and an RMQ sink. Incremental RocksDB checkpointing is enabled with an interval of 5 minutes.
I am trying to understand Flink failure recovery. After checkpoint X started, I sent one event to my source. Because the windows fire every 2s, my sink was updated with the aggregated result. But when I checked the RabbitMQ console, my source queue still showed the message as unacked. (This is expected and by design: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/datastream/rabbitmq/#rabbitmq-source ). I then restarted my task manager. The restart happened within the same checkpoint interval, before checkpoint X had completed, so the message had not been acknowledged and was delivered again. As a result, the event was processed twice. How can I avoid these duplicates?

Thanks,
Banu
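
P.S. To make the scenario concrete, here is a small plain-Java simulation of what I believe is happening. It is not Flink or RabbitMQ code: `RedeliveryDemo`, `SimQueue` and `run` are hypothetical names made up for illustration, and the model assumes the source acks only when a checkpoint completes and the broker requeues unacked messages when the consumer goes away.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RedeliveryDemo {

    /** Hypothetical stand-in for a broker queue with manual acknowledgements. */
    static class SimQueue {
        final Deque<String> ready = new ArrayDeque<>();
        final List<String> unacked = new ArrayList<>();

        void publish(String msg) {
            ready.add(msg);
        }

        /** Hands the next message to the consumer and tracks it as unacked. */
        String deliver() {
            String msg = ready.poll();
            if (msg != null) {
                unacked.add(msg);
            }
            return msg;
        }

        /** Assumption: called only when a checkpoint completes. */
        void ackAll() {
            unacked.clear();
        }

        /** Assumption: the consumer went away, so unacked messages become ready again. */
        void requeueUnacked() {
            ready.addAll(unacked);
            unacked.clear();
        }
    }

    /** Returns everything the simulated sink received. */
    public static List<String> run(boolean restartBeforeCheckpoint) {
        SimQueue source = new SimQueue();
        List<String> sink = new ArrayList<>();
        source.publish("event-1");

        // The 2s window fires long before the 5 minute checkpoint completes,
        // so the sink is written while the source message is still unacked.
        String msg = source.deliver();
        sink.add("agg(" + msg + ")");

        if (restartBeforeCheckpoint) {
            // Task manager restart: the message was never acked, so it is
            // delivered again and the window emits a second result.
            source.requeueUnacked();
            String redelivered = source.deliver();
            sink.add("agg(" + redelivered + ")");
        }

        // Checkpoint X completes: only now is the message acknowledged.
        source.ackAll();
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [agg(event-1)]
        System.out.println(run(true));  // prints [agg(event-1), agg(event-1)]
    }
}
```

The second call shows the duplicate I am seeing: the sink output is emitted before the checkpoint that would ack the source message.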