[ 
https://issues.apache.org/jira/browse/KAFKA-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508907#comment-16508907
 ] 

Lucas Wang commented on KAFKA-7040:
-----------------------------------

To be more specific, I think the following sequence of events may cause a 
truncation below the high watermark (HW).

Say both broker0 and broker1 have finished processing a LeaderAndISR request 
that names broker1 as the leader with leader epoch 10, and each of them has 
100 messages in its log (the largest offset is 99 and the LEO is 100).

1. The replica fetcher thread on broker0 issues a LeaderEpoch request with 
leader epoch 10 to broker1, and broker1 replies with a LEO of 100 because 10 
is the latest leader epoch. Before the replica fetcher on broker0 acquires the 
AbstractFetcherThread.partitionMapLock and processes the LeaderEpoch response, 
broker0 goes through steps 2-4.
2. A LeaderAndISR request causes broker0 to become the leader for partition 
t1p0, which in turn removes the partition t1p0 from the replica fetcher 
thread.
3. Broker0 accepts one message at offset 100 from a producer, and the message 
gets replicated to broker1, causing the HW on broker0 to go up to 100.
4. A 2nd LeaderAndISR request causes broker1 to become the leader, and broker0 
to become the follower for partition t1p0. This will cause the partition t1p0 
to be added back to the replica fetcher thread on broker0.
5. The replica fetcher thread on broker0 processes the LeaderEpoch response 
received in step 1, and truncates the message at offset 100 that was accepted 
in step 3.
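
The five steps can be replayed in a small toy model. This is only a sketch to 
make the ordering concrete, not Kafka code: the Broker class, its methods, and 
the last_committed_offset field are hypothetical names, and the only check 
modeled before truncation is "is the partition currently in the fetcher?", 
which is the check that the stale response slips past.

```python
# Toy replay of the race described above. Not Kafka code; all names are
# hypothetical and chosen only for illustration.

class Broker:
    def __init__(self, log_end_offset):
        # The log holds offsets 0 .. log_end_offset - 1.
        self.log = list(range(log_end_offset))
        # Offset of the newest message acknowledged as replicated.
        self.last_committed_offset = log_end_offset - 1
        # Partitions currently owned by the replica fetcher thread.
        self.fetcher_partitions = {"t1p0"}

    @property
    def log_end_offset(self):
        return len(self.log)

    def become_leader(self, partition):
        # Step 2: leadership removes the partition from the fetcher.
        self.fetcher_partitions.discard(partition)

    def become_follower(self, partition):
        # Step 4: the partition is added back to the fetcher.
        self.fetcher_partitions.add(partition)

    def append_and_commit(self):
        # Step 3: accept one message and treat it as fully replicated.
        offset = self.log_end_offset
        self.log.append(offset)
        self.last_committed_offset = offset
        return offset

    def handle_leader_epoch_response(self, partition, leader_end_offset):
        # Step 5: the response is applied if the partition is in the
        # fetcher *now*; nothing ties it to the request that produced it.
        if partition not in self.fetcher_partitions:
            return
        if leader_end_offset < self.log_end_offset:
            del self.log[leader_end_offset:]


def run_scenario():
    broker0 = Broker(log_end_offset=100)
    # Step 1: broker1 answered with LEO 100; the response is not yet processed.
    stale_response = ("t1p0", 100)
    broker0.become_leader("t1p0")                          # step 2
    accepted = broker0.append_and_commit()                 # step 3
    broker0.become_follower("t1p0")                        # step 4
    broker0.handle_leader_epoch_response(*stale_response)  # step 5
    return accepted, broker0.log_end_offset, broker0.last_committed_offset


if __name__ == "__main__":
    accepted, leo, committed = run_scenario()
    print(f"accepted offset {accepted}, LEO after step 5 = {leo}, "
          f"last committed offset = {committed}")
```

In this model the LEO after step 5 is no longer greater than the last 
committed offset, i.e. the committed message at offset 100 is gone from 
broker0's log.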

> The replica fetcher thread may truncate accepted messages during multiple 
> fast leadership transitions
> -----------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7040
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7040
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Lucas Wang
>            Priority: Minor
>
> Problem Statement:
> Consider the scenario where there are two brokers, broker0, and broker1, and 
> there are two partitions "t1p0", and "t1p1"[1], both of which have broker1 as 
> the leader and broker0 as the follower. The following sequence of events 
> happened on broker0:
> 1. The replica fetcher thread on broker0 issues a LeaderEpoch request to 
> broker1, and waits for the response.
> 2. A LeaderAndISR request causes broker0 to become the leader for one 
> partition t1p0, which in turn will remove the partition t1p0 from the replica 
> fetcher thread.
> 3. Broker0 accepts some messages from a producer.
> 4. A 2nd LeaderAndISR request causes broker1 to become the leader, and 
> broker0 to become the follower for partition t1p0. This will cause the 
> partition t1p0 to be added back to the replica fetcher thread on broker0.
> 5. The replica fetcher thread on broker0 receives a response for the 
> LeaderEpoch request issued in step 1, and truncates the messages accepted in 
> step 3.
> The issue can be reproduced with the test from 
> https://github.com/gitlw/kafka/commit/8956e743f0e432cc05648da08c81fc1167b31bea
> [1] Initially we set up broker0 to be the follower of two partitions instead 
> of just one, to avoid the shutting down of the replica fetcher thread when it 
> becomes idle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
