Robert Metzger created FLINK-28303:
--------------------------------------

             Summary: Kafka SQL Connector loses data when restoring from a 
savepoint with a topic with empty partitions
                 Key: FLINK-28303
                 URL: https://issues.apache.org/jira/browse/FLINK-28303
             Project: Flink
          Issue Type: Bug
          Components: Connectors / Kafka
    Affects Versions: 1.14.4
            Reporter: Robert Metzger


Steps to reproduce:
- Set up a Kafka topic with 10 partitions.
- Produce records 0-9 into the topic.
- Take a savepoint and stop the job.
- Produce records 10-19 into the topic.
- Restore the job from the savepoint.

After the restore, the job is usually missing 2-4 of the records 10-19.

My assumption is that if a partition never had data (which is likely with 10 
partitions and 10 records), the savepoint will only contain offsets for 
partitions with data. 
While the job was offline (and we wrote records 10-19 into the topic), the 
previously empty partitions received data. Now, when the job comes online 
again, it will use the "latest" offset for those partitions, skipping the 
records written to them in the meantime.

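The assumption above can be sanity-checked with a small simulation. This is only a sketch of the hypothesized mechanism, not Flink or Kafka code: the names `produce`, `take_savepoint`, `restore_and_read` and `run_once` are made up for illustration, and records are assigned to partitions uniformly at random, which is an assumption about the partitioner used in the reproduction.

```python
import random

NUM_PARTITIONS = 10


def produce(log, records, rng):
    """Append each record to a uniformly random partition (assumed partitioner)."""
    for record in records:
        log[rng.randrange(NUM_PARTITIONS)].append(record)


def take_savepoint(log):
    """Hypothesized behavior: store the next offset only for partitions with data."""
    return {p: len(recs) for p, recs in enumerate(log) if recs}


def restore_and_read(log, savepoint, fallback):
    """Read every partition from its saved offset, or from `fallback` if none was saved."""
    consumed = []
    for p, recs in enumerate(log):
        if p in savepoint:
            start = savepoint[p]
        elif fallback == "latest":
            start = len(recs)  # skip everything already in the partition
        else:  # "earliest"
            start = 0
        consumed.extend(recs[start:])
    return consumed


def run_once(seed, fallback):
    """Run the reproduction steps once and return the lost records from 10-19."""
    rng = random.Random(seed)
    log = [[] for _ in range(NUM_PARTITIONS)]
    produce(log, range(0, 10), rng)       # records 0-9
    savepoint = take_savepoint(log)       # savepoint, job stops
    produce(log, range(10, 20), rng)      # records 10-19 while the job is offline
    consumed = restore_and_read(log, savepoint, fallback)
    return sorted(set(range(10, 20)) - set(consumed))


if __name__ == "__main__":
    trials = 5_000
    lost = [len(run_once(seed, "latest")) for seed in range(trials)]
    print("mean records lost with 'latest' fallback:", sum(lost) / trials)
    print("analytic expectation 10 * 0.9**10:", 10 * 0.9 ** 10)
```

Under these assumptions a partition is still empty after the first 10 records with probability 0.9^10, so the expected number of lost records is 10 * 0.9^10, roughly 3.5, which is in line with the 2-4 missing records observed. With an "earliest" fallback for partitions that have no saved offset, the same model loses nothing.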



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
