Lucas Wang created KAFKA-10751:
----------------------------------

             Summary: Generate log to help estimate messages lost during ULE
                 Key: KAFKA-10751
                 URL: https://issues.apache.org/jira/browse/KAFKA-10751
             Project: Kafka
          Issue Type: Improvement
            Reporter: Lucas Wang
            Assignee: Lucas Wang


During Unclean Leader Election, there could be data loss due to truncation at 
the resigned leader.

Suppose there are 3 brokers that have replicas for a given partition:
Broker A (leader) with largest offset 9 (log end offset 10)
Broker B (follower) with largest offset 4 (log end offset 5)
Broker C (follower) with largest offset 1 (log end offset 2)

Only the leader A is in the ISR; B and C are lagging behind.
Now an unclean leader election causes the leadership to be transferred to C. 
Broker A would need to truncate 8 messages, and Broker B 3 messages.
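
The truncation counts above follow directly from the log end offsets. A
minimal sketch of the arithmetic, assuming each replica truncates to the new
leader's log end offset (the function and variable names are hypothetical and
are not Kafka APIs):

```python
def truncated_messages(log_end_offset: int, new_leader_log_end_offset: int) -> int:
    """Messages a replica drops when truncating to the new leader's log end offset."""
    return max(0, log_end_offset - new_leader_log_end_offset)


# Log end offsets from the example above (hypothetical in-memory representation).
log_end_offsets = {"A": 10, "B": 5, "C": 2}
new_leader = "C"

truncated = {
    broker: truncated_messages(leo, log_end_offsets[new_leader])
    for broker, leo in log_end_offsets.items()
}
print(truncated)  # {'A': 8, 'B': 3, 'C': 0}
```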

Case 1: if these messages were produced with acks=0 or acks=1, then clients 
would experience 8 lost messages (offsets 2 through 9).
Case 2: if the client is using acks=all and the partition's minISR setting 
(min.insync.replicas) is 2, and we further assume that broker B dropped out of 
the ISR after receiving the message with offset 4, then only the messages with 
offset<=4 have been acked to the client. The truncation effectively causes the 
client to lose 3 acked messages (offsets 2 through 4).

Knowing the exact amount of data loss involves knowing the client's acks 
setting when the messages were produced, and also whether the messages had been 
sufficiently replicated according to the minISR setting.
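
As a rough illustration of the two cases, the loss seen by the client can be
computed from the last offset that was acked and the new leader's log end
offset. This is only a sketch under the assumptions stated in the cases above;
`lost_acked_messages` is a hypothetical helper, and in Case 1 it simplifies by
treating every message written to the old leader as acked:

```python
def lost_acked_messages(last_acked_offset: int, new_leader_log_end_offset: int) -> int:
    """Acked messages that no longer exist once the new leader takes over.

    Offsets new_leader_log_end_offset..last_acked_offset (inclusive) are lost.
    """
    return max(0, last_acked_offset + 1 - new_leader_log_end_offset)


NEW_LEADER_LOG_END_OFFSET = 2  # broker C in the example

# Case 1: acks=0 or acks=1, so everything on the old leader (up to offset 9)
# is assumed to count as acked.
case1_loss = lost_acked_messages(9, NEW_LEADER_LOG_END_OFFSET)

# Case 2: acks=all with minISR=2, so only offsets <= 4 were acked.
case2_loss = lost_acked_messages(4, NEW_LEADER_LOG_END_OFFSET)

print(case1_loss, case2_loss)  # 8 3
```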


If getting the exact data loss is too involved, at least there should be logs 
to help ESTIMATE the amount of data loss.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
