Jason Gustafson created KAFKA-13141:
---------------------------------------

             Summary: Leader should not update follower fetch offset if 
diverging epoch is present
                 Key: KAFKA-13141
                 URL: https://issues.apache.org/jira/browse/KAFKA-13141
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 2.7.1, 2.8.0
            Reporter: Jason Gustafson
            Assignee: Jason Gustafson
             Fix For: 3.0.0, 2.7.2, 2.8.1


In 2.7, we began piggybacking follower truncation on the Fetch protocol 
instead of using the old OffsetsForLeaderEpoch API. When the leader detects 
divergence, it returns a `divergingEpoch` field in the Fetch response but does 
not set an error code. The follower that sent the fetch is expected to check 
whether the diverging epoch is present and truncate accordingly.
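
To make the follower-side contract concrete, here is a minimal sketch in 
Python. It is not the actual fetcher code: `DivergingEpoch`, `FetchResponse`, 
and `handle_fetch_response` are hypothetical names, the log is modelled as a 
plain list whose index is the offset, and truncation is simplified to "cut 
back to the end offset reported by the leader".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DivergingEpoch:
    # Hypothetical stand-in for the `divergingEpoch` field of the response.
    epoch: int
    end_offset: int


@dataclass
class FetchResponse:
    # Note: error_code stays 0 even when diverging_epoch is set.
    error_code: int = 0
    diverging_epoch: Optional[DivergingEpoch] = None
    records: List[str] = field(default_factory=list)


def handle_fetch_response(log: List[str], response: FetchResponse) -> int:
    """Apply a fetch response to the follower log; return the next fetch offset."""
    if response.diverging_epoch is not None:
        # Simplified truncation: drop everything at or beyond the end offset
        # the leader reported for the diverging epoch.
        truncate_to = min(response.diverging_epoch.end_offset, len(log))
        del log[truncate_to:]
    else:
        log.extend(response.records)
    return len(log)
```

The point of the sketch is that the follower cannot rely on the error code 
alone: a response with no error may still require truncation.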

All of this works correctly in the fetcher implementation, but the problem is 
that the logic to update the follower fetch position on the leader does not 
take into account the diverging epoch present in the response. This means the 
fetch offsets can be updated incorrectly, which can lead to either log 
divergence or the loss of committed data.
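
The behaviour asked for in the summary can be sketched as a guard on the 
leader side. This is an illustration only, not the actual patch: 
`maybe_update_follower_fetch_offset` and the dictionary of follower positions 
are hypothetical.

```python
from typing import Dict, Optional


def maybe_update_follower_fetch_offset(
    follower_fetch_offsets: Dict[int, int],
    follower_id: int,
    fetch_offset: int,
    diverging_epoch: Optional[int],
) -> bool:
    """Record the follower's fetch offset unless the response diverges.

    Returns True if the position was updated, False if it was skipped.
    """
    if diverging_epoch is not None:
        # The follower still has to truncate, so its fetch offset does not
        # describe data it shares with the leader. Leave the state untouched.
        return False
    follower_fetch_offsets[follower_id] = fetch_offset
    return True
```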

For example, we hit the following case with 3 replicas. Broker 1 is elected 
leader in epoch 1 with an end offset of 100. The followers are at offset 101.

Broker 1: (Leader) Epoch 1 from offset 100
Broker 2: (Follower) Epoch 1 from offset 101
Broker 3: (Follower) Epoch 1 from offset 101

Broker 1 receives fetches from 2 and 3 at offset 101. The leader detects the 
divergence and returns a diverging epoch in the fetch response. Nevertheless, 
the fetch positions for both followers are updated to 101 and the high 
watermark is advanced.
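
The effect on the high watermark can be worked through with a small sketch. 
It assumes that all three brokers are in the ISR, that the high watermark is 
the minimum position across the ISR, and that the leader appends one record 
at offset 100 (so its end offset becomes 101), which is inferred from the 
state shown for epoch 2 rather than stated explicitly. `high_watermark` is a 
hypothetical helper.

```python
from typing import Dict


def high_watermark(leader_end_offset: int,
                   follower_fetch_offsets: Dict[int, int]) -> int:
    """Minimum position across the leader and the followers in the ISR."""
    return min([leader_end_offset, *follower_fetch_offsets.values()])


# Positions wrongly recorded from the diverging fetches at offset 101.
buggy = high_watermark(101, {2: 101, 3: 101})

# Positions after the followers truncate and fetch again from offset 100.
expected = high_watermark(101, {2: 100, 3: 100})
```

Under these assumptions `buggy` is 101 and `expected` is 100, so the record 
at offset 100 is treated as committed before either follower has it.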

After brokers 2 and 3 had truncated to offset 100, broker 1 experienced a 
network partition of some kind and was kicked from the ISR. This caused broker 
2 to get elected, which resulted in the following state at the start of epoch 2.

Broker 1: (Follower) Epoch 2 from offset 101
Broker 2: (Leader) Epoch 2 from offset 100
Broker 3: (Follower) Epoch 2 from offset 100

Broker 2 was then able to write a new entry at offset 100, and the old record, 
which may have been exposed to consumers, was deleted by broker 1.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
