Hi team

In the Kafka Sink docs [1], with EXACTLY_ONCE it is recommended to set:
transaction_timeout > maximum_checkpoint_duration + maximum_restart_duration

I understand transaction_timeout > maximum_checkpoint_duration.
But why add maximum_restart_duration?

If the application recovers from a checkpoint, any uncommitted messages
written after the last successful checkpoint will be re-written
regardless. So if a transaction times out during recovery, it doesn't
matter.

I would rather say:
transaction_timeout > maximum_checkpoint_duration + checkpoint_interval
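
To make the difference concrete, here is a rough sketch comparing the
two lower bounds. All durations are made-up numbers for illustration,
not measurements from a real job, and the function names are mine. I am
assuming the result would be applied through the Kafka producer property
transaction.timeout.ms, which is expressed in milliseconds.

```python
from datetime import timedelta

# Hypothetical figures, chosen only for illustration.
checkpoint_interval = timedelta(minutes=1)
max_checkpoint_duration = timedelta(minutes=2)
max_restart_duration = timedelta(minutes=10)


def documented_bound(max_checkpoint, max_restart):
    """Lower bound on the transaction timeout as stated in the docs [1]."""
    return max_checkpoint + max_restart


def proposed_bound(max_checkpoint, interval):
    """Lower bound on the transaction timeout proposed in this message."""
    return max_checkpoint + interval


def to_millis(duration):
    """Convert a timedelta to whole milliseconds."""
    return int(duration.total_seconds() * 1000)


documented = documented_bound(max_checkpoint_duration, max_restart_duration)
proposed = proposed_bound(max_checkpoint_duration, checkpoint_interval)

# The timeout must be strictly greater than each bound.
print("documented: transaction.timeout.ms >", to_millis(documented))
print("proposed:   transaction.timeout.ms >", to_millis(proposed))
```

With these numbers the documented bound is driven almost entirely by
the restart duration, which is the term I am asking about.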

Any thoughts?

Regards
Lorenzo

[1] https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/datastream/kafka/#fault-tolerance
