Chris Egerton created KAFKA-13469:
-------------------------------------

             Summary: End-of-life offset commit for source task can take place 
before all records are flushed
                 Key: KAFKA-13469
                 URL: https://issues.apache.org/jira/browse/KAFKA-13469
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 3.1.0, 3.0.1
            Reporter: Chris Egerton
            Assignee: Chris Egerton
             Fix For: 3.1.0, 3.0.1


When we fixed KAFKA-12226, we made offset commits for source tasks take place 
without blocking until in-flight records are acknowledged. While a task is 
running, this is beneficial: it allows us to continue to commit offsets even 
when a topic partition on the broker is unavailable or the producer is unable 
to send records to Kafka as quickly as they are produced by the task.

However, this becomes problematic when a task is scheduled for shutdown with 
in-flight records. During shutdown, the latest committable offsets are 
calculated and then flushed to the offset backing store (in distributed mode, 
this is the offsets topic). During that flush, the task's producer may continue 
to send records to Kafka, but the offsets for those records are not part of 
the flush and will never be committed, so the records are redelivered if/when 
the task is restarted.

Essentially, duplicate records are now possible even in healthy source tasks.
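
To make the shutdown sequence above concrete, here is a minimal toy model in 
Java. It is not Kafka Connect code: the class and method names are 
hypothetical, offsets are plain array indexes, and "acked" stands in for a 
producer acknowledgement. It only illustrates how records acknowledged after 
the committable offsets are calculated end up being redelivered.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the end-of-life commit race; all names are hypothetical. */
final class ShutdownCommitSketch {

    /** Highest offset whose record and all earlier records are acked, or -1 if none. */
    static long committableOffset(boolean[] acked) {
        long committable = -1;
        for (int i = 0; i < acked.length && acked[i]; i++) {
            committable = i;
        }
        return committable;
    }

    /**
     * Simulates a shutdown: the committable offset is calculated first, then
     * ackedDuringFlush more records are acknowledged while the flush is in
     * progress. Returns the offsets that a restarted task would send again
     * even though they were already written.
     */
    static List<Long> duplicatesAfterRestart(int totalRecords,
                                             int ackedBeforeSnapshot,
                                             int ackedDuringFlush) {
        boolean[] acked = new boolean[totalRecords];
        for (int i = 0; i < ackedBeforeSnapshot; i++) {
            acked[i] = true;
        }

        // Step 1: calculate the offset that the final flush will commit.
        long committed = committableOffset(acked);

        // Step 2: the producer keeps delivering records during the flush.
        int lastAcked = Math.min(totalRecords, ackedBeforeSnapshot + ackedDuringFlush);
        for (int i = ackedBeforeSnapshot; i < lastAcked; i++) {
            acked[i] = true;
        }

        // Step 3: after restart the task resumes from committed + 1, so every
        // record past that point that was already acked becomes a duplicate.
        List<Long> duplicates = new ArrayList<>();
        for (long offset = committed + 1; offset < totalRecords; offset++) {
            if (acked[(int) offset]) {
                duplicates.add(offset);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // 10 records: 6 acked before the snapshot, 3 more acked during the flush.
        System.out.println(duplicatesAfterRestart(10, 6, 3)); // prints [6, 7, 8]
    }
}
```

In this model the duplicates disappear only when nothing is acknowledged 
between the calculation and the end of the flush, which matches the 
description above.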



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
