[ 
https://issues.apache.org/jira/browse/KAFKA-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868153#comment-16868153
 ] 

Vikas Singh commented on KAFKA-8001:
------------------------------------

Comment from Jason on the PR that needs to be addressed:
{noformat}
I was discussing with @apovzner and we realized that there may be a few more 
issues here. For a normal replica, whenever we observe an epoch bump, we go 
through a reconciliation protocol to find a truncation point at which we know 
it is safe to resume fetching. Basically it works like this:

1. Follower observes epoch bump and enters Truncating state.
2. Follower sends OffsetsForLeaderEpoch query to leader with the latest epoch 
   from its log.
3. Leader looks in its local log to find the largest epoch less than or equal 
   to the requested epoch and returns its end offset.
4. The follower will truncate to this offset and then possibly go back to 
   step 2 if additional truncation is needed.
5. Once truncation is complete, the follower enters the Fetching state.

For a future replica, the protocol is basically the same, but rather than 
sending OffsetsForLeaderEpoch to the leader, we use the state from the local 
replica which may or may not be the leader of the bumped epoch. The basic 
problem we realized is that this log reconciliation is not safe while the local 
replica is in the Truncating state and we have nothing at the moment to 
guarantee that it is not in that state. Basically we have to wait until it has 
reached step 5 before the future replica can do its own truncation.

I suspect that what we have to do to fix this problem is move the state of 
the fetcher (i.e. Truncating|Fetching and the current epoch) out of 
AbstractFetcherThread and into something which can be accessed by the alter 
log dir fetcher. The most obvious candidate seems to be kafka.cluster.Replica. 
Unfortunately, this may not be a small amount of work, but I cannot think of 
how to make this protocol work unless we do it.{noformat}
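
To make the reconciliation steps quoted above concrete, here is a minimal, self-contained sketch of the truncation loop. It is not Kafka's code: EpochLogSketch, TruncationSketch and all method names are hypothetical, the log is modelled only as a map from leader epoch to start offset, and truncating to offset 0 when the leader knows no usable epoch is a simplifying assumption. The last method illustrates the guard discussed in the comment: the future replica should not truncate against the local replica while the local replica is still Truncating.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy log model (hypothetical, not a Kafka class): leader epoch -> start offset. */
final class EpochLogSketch {
    private final TreeMap<Integer, Long> epochStartOffsets = new TreeMap<>();
    private long logEndOffset = 0L;

    /** Appends recordCount records written under the given leader epoch. */
    void append(int epoch, long recordCount) {
        epochStartOffsets.putIfAbsent(epoch, logEndOffset);
        logEndOffset += recordCount;
    }

    long logEndOffset() { return logEndOffset; }

    /** Latest epoch in the log, or -1 if the log holds no epochs. */
    int latestEpoch() { return epochStartOffsets.isEmpty() ? -1 : epochStartOffsets.lastKey(); }

    /** Step 3: largest epoch <= requestedEpoch and its end offset, as {epoch, endOffset}; null if none. */
    long[] endOffsetFor(int requestedEpoch) {
        Map.Entry<Integer, Long> floor = epochStartOffsets.floorEntry(requestedEpoch);
        if (floor == null) return null;
        Map.Entry<Integer, Long> next = epochStartOffsets.higherEntry(floor.getKey());
        long end = (next == null) ? logEndOffset : next.getValue();
        return new long[] { floor.getKey(), end };
    }

    /** Step 4: drops all records at or after offset, and every epoch that starts there or later. */
    void truncateTo(long offset) {
        logEndOffset = Math.min(logEndOffset, offset);
        epochStartOffsets.entrySet().removeIf(e -> e.getValue() >= offset);
    }
}

final class TruncationSketch {
    enum State { TRUNCATING, FETCHING }

    /** Runs steps 1-5 for a follower against a leader and returns the final state. */
    static State reconcile(EpochLogSketch follower, EpochLogSketch leader) {
        // Step 1: the follower is in the Truncating state for the whole loop.
        while (true) {
            int requested = follower.latestEpoch();              // step 2
            if (requested < 0) break;                            // empty log: nothing to reconcile
            long[] fromLeader = leader.endOffsetFor(requested);  // step 3
            long target;
            if (fromLeader == null) {
                target = 0L;                                     // assumption: no common epoch, start over
            } else if (fromLeader[0] == requested) {
                target = fromLeader[1];
            } else {
                // Leader answered with an older epoch: also consult the follower's own end offset for it.
                long[] own = follower.endOffsetFor((int) fromLeader[0]);
                target = Math.min(own == null ? 0L : own[1], fromLeader[1]);
            }
            follower.truncateTo(target);                         // step 4
            if (follower.latestEpoch() == requested) break;      // epoch survived: no further truncation
        }
        return State.FETCHING;                                   // step 5
    }

    /** Guard from the comment: only reconcile the future replica once the local replica is Fetching. */
    static boolean futureReplicaMayTruncate(State localReplicaState) {
        return localReplicaState == State.FETCHING;
    }
}
```

For example, a follower holding epochs 0, 1 and 2 with a log end offset of 12, reconciled against a leader whose epoch 1 ends at offset 8, would end up truncated to offset 8 with epoch 1 as its latest epoch.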

> Fetch from future replica stalls when local replica becomes a leader
> --------------------------------------------------------------------
>
>                 Key: KAFKA-8001
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8001
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.1.0, 2.1.1
>            Reporter: Anna Povzner
>            Assignee: Vikas Singh
>            Priority: Critical
>
> With KIP-320, a fetch from a follower or future replica returns 
> FENCED_LEADER_EPOCH if the current leader epoch in the request is lower than 
> the leader epoch known to the leader (or to the local replica, in the case of 
> future replica fetching). When a future replica is fetching from the local 
> replica and the local replica becomes the leader of the partition, the next 
> fetch from the future replica fails with FENCED_LEADER_EPOCH, and fetching 
> from the future replica stops until the next leader change. 
> Proposed solution: on a local replica leader change, the future replica 
> should "become a follower" again and go through the truncation phase. 
> Alternatively, as an optimization, we could just update the partition state 
> of the future replica to reflect the updated current leader epoch. 
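
The stall described in the issue can be illustrated with a small sketch of the epoch check. This is only an illustration under stated assumptions: EpochFencingSketch, FutureReplicaFetchState and onLocalReplicaLeaderChange are hypothetical names, not Kafka classes or methods. FENCED_LEADER_EPOCH is the error named in the description; UNKNOWN_LEADER_EPOCH is the companion error from KIP-320 for a request epoch that is ahead of the known one and is included for completeness.

```java
/** Hypothetical sketch of the KIP-320 epoch check and the proposed "become a follower again" fix. */
final class EpochFencingSketch {
    enum FetchError { NONE, FENCED_LEADER_EPOCH, UNKNOWN_LEADER_EPOCH }

    /** Check applied by the leader, or by the local replica when the future replica fetches from it. */
    static FetchError validate(int requestLeaderEpoch, int knownLeaderEpoch) {
        if (requestLeaderEpoch < knownLeaderEpoch) return FetchError.FENCED_LEADER_EPOCH;
        if (requestLeaderEpoch > knownLeaderEpoch) return FetchError.UNKNOWN_LEADER_EPOCH;
        return FetchError.NONE;
    }

    /** Assumed fetch state kept for the future replica (illustrative only). */
    static final class FutureReplicaFetchState {
        int currentLeaderEpoch;
        boolean truncating;

        FutureReplicaFetchState(int currentLeaderEpoch) {
            this.currentLeaderEpoch = currentLeaderEpoch;
            this.truncating = false;
        }

        /** Proposed fix: adopt the new epoch and go through the truncation phase again. */
        void onLocalReplicaLeaderChange(int newLeaderEpoch) {
            this.currentLeaderEpoch = newLeaderEpoch;
            this.truncating = true;
        }
    }
}
```

If the future replica keeps fetching with epoch 5 after the local replica has moved to epoch 6, validate(5, 6) yields FENCED_LEADER_EPOCH on every attempt, which is the stall. After onLocalReplicaLeaderChange(6), the same check passes, and the truncating flag marks that the future replica still has to reconcile its log before it resumes fetching.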



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
