Hello,

I was reading about Flink's checkpointing and wanted to check whether I have
correctly understood how barriers are used for exactly-once processing.
 1) An operator aligns by buffering records that arrive on an input after a
barrier until it has received that barrier from all upstream operator instances.
 2) A barrier is always preceded by a watermark, which triggers processing of
all windows that are complete.
 3) Records in windows that have not yet been triggered are also saved as part
of the checkpoint. These windows are repopulated when restoring from a checkpoint.
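
To make point 1 concrete, this is roughly how I picture alignment. It is a toy
sketch in plain Java, not Flink code: the class name BarrierAligner, its methods,
and the "SNAPSHOT" marker are all made up for illustration, and I am assuming one
checkpoint in flight at a time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical illustration only; this is not how Flink is actually implemented.
class BarrierAligner {
    private final int numChannels;
    // Input channels on which the current checkpoint's barrier has already arrived.
    private final Set<Integer> blocked = new HashSet<>();
    // Records held back because they arrived after the barrier on their channel.
    private final Queue<String> buffered = new ArrayDeque<>();
    // Everything the operator has processed, in processing order.
    final List<String> output = new ArrayList<>();

    BarrierAligner(int numChannels) {
        this.numChannels = numChannels;
    }

    void onRecord(int channel, String record) {
        if (blocked.contains(channel)) {
            buffered.add(record);   // belongs to the next checkpoint epoch, so hold it
        } else {
            output.add(record);     // still part of the current epoch, process now
        }
    }

    void onBarrier(int channel) {
        blocked.add(channel);
        if (blocked.size() == numChannels) {
            output.add("SNAPSHOT"); // all barriers seen: state would be snapshotted here
            blocked.clear();
            while (!buffered.isEmpty()) {
                output.add(buffered.poll()); // release held records after the snapshot
            }
        }
    }
}
```

With two inputs, a record that arrives on input 0 after its barrier is held
until the barrier from input 1 shows up, so the snapshot reflects exactly the
pre-barrier records from both inputs.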

In production setups, have there been cases where alignment during
checkpointing caused unacceptable latency?
If so, is there a way to cap the alignment wait, say at a maximum of 100 ms?
That way we would get exactly-once in most situations but fall back to
at-least-once rather than accept higher latency in corner cases.
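
The behaviour I have in mind is sketched below, again as a plain Java toy and
not as any existing Flink API. BoundedAligner, maxAlignMillis, and the snapshot
labels are hypothetical names, and timestamps are passed in explicitly so that
the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the requested behaviour; not a Flink API.
class BoundedAligner {
    private final int numChannels;
    private final long maxAlignMillis;
    private final Set<Integer> barrierSeen = new HashSet<>();
    private final List<String> buffered = new ArrayList<>();
    private long alignStart = -1;    // arrival time of the first barrier of this checkpoint
    private boolean gaveUp = false;  // true once the alignment wait exceeded the limit
    final List<String> output = new ArrayList<>();

    BoundedAligner(int numChannels, long maxAlignMillis) {
        this.numChannels = numChannels;
        this.maxAlignMillis = maxAlignMillis;
    }

    void onRecord(int channel, String record, long nowMillis) {
        // If alignment has lasted too long, stop blocking and release held records.
        if (!gaveUp && alignStart >= 0 && nowMillis - alignStart > maxAlignMillis) {
            gaveUp = true;
            output.addAll(buffered);
            buffered.clear();
        }
        if (barrierSeen.contains(channel) && !gaveUp) {
            buffered.add(record);  // normal alignment: hold post-barrier records
        } else {
            output.add(record);    // pre-barrier record, or alignment was abandoned
        }
    }

    void onBarrier(int channel, long nowMillis) {
        if (barrierSeen.isEmpty()) {
            alignStart = nowMillis;
        }
        barrierSeen.add(channel);
        if (barrierSeen.size() == numChannels) {
            // Post-barrier records processed before this point may be replayed on
            // recovery, so the checkpoint is only at-least-once if we gave up.
            output.add(gaveUp ? "SNAPSHOT:at-least-once" : "SNAPSHOT:exactly-once");
            output.addAll(buffered);
            buffered.clear();
            barrierSeen.clear();
            gaveUp = false;
            alignStart = -1;
        }
    }
}
```

In this toy version the limit is only checked when a record arrives; a real
implementation would presumably need a timer so that held records are released
even when no further input shows up.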

Srikanth
