[ https://issues.apache.org/jira/browse/KAFKA-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235702#comment-17235702 ]

Lucas Wang commented on KAFKA-10751:
------------------------------------

PR submitted: https://github.com/apache/kafka/pull/9533

> Generate log to help estimate messages lost during ULE
> ------------------------------------------------------
>
>                 Key: KAFKA-10751
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10751
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Lucas Wang
>            Assignee: Lucas Wang
>            Priority: Major
>
> During an Unclean Leader Election (ULE), data can be lost due to truncation
> at the resigned leader.
> Suppose there are 3 brokers that have replicas for a given partition:
>   Broker A (leader) with largest offset 9 (log end offset 10)
>   Broker B (follower) with largest offset 4 (log end offset 5)
>   Broker C (follower) with largest offset 1 (log end offset 2)
> Only the leader A is in the ISR, with B and C lagging behind.
> Now an unclean leader election causes the leadership to be transferred to C.
> Broker A would need to truncate 8 messages (offsets 2 through 9), and
> Broker B 3 messages (offsets 2 through 4).
> Case 1: if these messages were produced with acks=0 or 1, then clients
> would experience 8 lost messages.
> Case 2: if the client is using acks=all and the partition's minISR setting is
> 2, and we further assume that broker B dropped out of the ISR after receiving
> the message with offset 4, then only the messages with offset<=4 have been
> acked to the client. The truncation effectively causes the client to lose 3
> acked messages (offsets 2 through 4).
> Knowing the exact amount of data loss involves knowing the client's acks
> setting when the messages were produced, and also whether the messages had
> been sufficiently replicated according to the MinISR setting.
> If getting the exact data loss is too involved, at least there should be logs
> to help ESTIMATE the amount of data loss.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
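
The truncation arithmetic in the three-broker example from the issue description can be sketched as below. This is only an illustration of the numbers quoted in the ticket, not the logging added by the PR. The names `Replica`, `truncated_messages` and `estimate_acked_loss` are hypothetical, and the sketch assumes each follower's log is a prefix of the old leader's log, so it ignores divergence between leader epochs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical stand-in for one broker's copy of the partition."""
    broker: str
    log_end_offset: int  # offset the next appended message would receive


def truncated_messages(replica: Replica, new_leader: Replica) -> int:
    """Messages the replica must drop to match the new leader's log."""
    return max(0, replica.log_end_offset - new_leader.log_end_offset)


def estimate_acked_loss(last_acked_offset: int, new_leader: Replica) -> int:
    """Acked messages lost: offsets from the new leader's log end offset
    up to and including last_acked_offset."""
    return max(0, last_acked_offset + 1 - new_leader.log_end_offset)


# Values taken from the example in the issue description.
a = Replica("A", log_end_offset=10)
b = Replica("B", log_end_offset=5)
c = Replica("C", log_end_offset=2)  # new leader after the unclean election

print(truncated_messages(a, c))   # A truncates offsets 2..9
print(truncated_messages(b, c))   # B truncates offsets 2..4
print(estimate_acked_loss(9, c))  # Case 1: acks=0 or 1, offsets up to 9 sent
print(estimate_acked_loss(4, c))  # Case 2: acks=all, only offsets <= 4 acked
```

The first two calls give the per-broker truncation counts (8 and 3), and the last two give the client-visible loss for Case 1 and Case 2 (8 and 3). An estimate of this kind needs only the log end offsets involved, which is why a broker-side log line could be enough to bound the loss even without knowing each producer's acks setting.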