Hi,

I'm planning to build a pipeline that uses a Kafka source, some stateful
transformation and a RabbitMQ sink. What I don't yet fully understand is
how often I should expect the "at-least-once" scenario (i.e. seeing
duplicates) on the sink side. The case where things start failing is clear
to me, but what happens when I want to gracefully stop the Flink job?
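To make the setup concrete, here is a minimal sketch of the kind of job I
have in mind (topic, queue, host names and the identity map are just
placeholders, the real stateful logic would replace the map):

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.rabbitmq.RMQSink;
import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;

public class KafkaToRabbitJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // checkpoint every 60s (illustrative value)

        // Kafka source -- broker/topic/group names are placeholders
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("input-topic")
                .setGroupId("my-consumer-group")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // RabbitMQ connection -- host/credentials are placeholders
        RMQConnectionConfig rmqConfig = new RMQConnectionConfig.Builder()
                .setHost("rabbitmq")
                .setPort(5672)
                .setUserName("guest")
                .setPassword("guest")
                .setVirtualHost("/")
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
                .keyBy(value -> value)
                // the real stateful transformation (e.g. a KeyedProcessFunction)
                // would go here; an identity map stands in for it
                .map(value -> value)
                .addSink(new RMQSink<>(rmqConfig, "output-queue", new SimpleStringSchema()));

        env.execute("kafka-to-rabbitmq");
    }
}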

Am I right in thinking that when I gracefully stop a job with a final
savepoint [1], the Kafka source stops consuming, a checkpoint barrier is
sent through the pipeline, and this flushes the sink completely? So my
understanding is that if nothing fails and the Kafka offsets are committed,
then when the job is started again from that savepoint it will not result
in any duplicates being sent to RabbitMQ. Is that correct?
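For reference, the way I would stop the job is roughly the command from [1]
(the savepoint path is just an example):

$ ./bin/flink stop --savepointPath /tmp/flink-savepoints $JOB_ID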

Thanks!

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint

-- 
Piotr Domagalski
