hachikuji commented on PR #12499:
URL: https://github.com/apache/kafka/pull/12499#issuecomment-1211245795

   > So we prevent sending the request if the epoch is lower? And is it the 
case that there is always a controller with an epoch at least as large? Or in 
some cases would we need to wait/retry until such a controller exists?
   
   @jolshan Yes, that is right. Ensuring some level of monotonicity seems like 
a good general change even outside the original bug. It is weird to allow the 
broker to send requests to a controller that it knows for sure is stale, and it 
makes the system harder to reason about. 
   
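   As a rough illustration of the monotonicity idea, here is a minimal sketch 
in Java. `ControllerEpochGate` and its methods are hypothetical names made up 
for this example and are not part of Kafka; the sketch only assumes that the 
broker can learn controller epochs from some source and compare them before 
sending.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not a Kafka class): remembers the highest controller epoch seen. */
final class ControllerEpochGate {
    // -1 means no controller epoch has been observed yet.
    private final AtomicInteger highestSeenEpoch = new AtomicInteger(-1);

    /** Record an epoch learned from any source; the stored value never decreases. */
    void observe(int epoch) {
        highestSeenEpoch.accumulateAndGet(epoch, Math::max);
    }

    /** Allow a send only if the target controller is not known to be stale. */
    boolean maySendTo(int targetControllerEpoch) {
        return targetControllerEpoch >= highestSeenEpoch.get();
    }

    int highestSeen() {
        return highestSeenEpoch.get();
    }
}
```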
   One thing I have been trying to think through is how this bug affects KRaft. 
The KRaft controller will also return `FENCED_LEADER_EPOCH` if the leader epoch 
in the request is higher than what it has in its cache. But does KRaft give us 
a stronger guarantee to work with? I think it could, but at the moment, we do 
not proactively reset the controller in `BrokerToControllerChannelManager` 
after we discover a new controller. We only reset it after we receive a 
`NOT_CONTROLLER` error in a request. So it seems to me that we could still hit 
the same problem with KRaft. As a matter of fact, I think this patch does not 
fix the KRaft problem because we do not propagate the controller epoch down to 
`Partition` so that it can be used in `AlterPartition` requests. cc @jsancio 

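   To make the difference between the two reset paths concrete, here is a 
small sketch in Java. It is a simplified model, not the real 
`BrokerToControllerChannelManager`: the class `ControllerTargetTracker`, its 
methods, and the use of a plain string for the error name are all assumptions 
made for illustration.

```java
import java.util.OptionalInt;

/** Hypothetical model (not Kafka code) of how a broker picks its controller target. */
final class ControllerTargetTracker {
    private int activeControllerId = -1;    // -1 means unknown: rediscover before sending
    private int activeControllerEpoch = -1;

    /** Reactive path: the target is dropped only after a NOT_CONTROLLER error. */
    void onResponseError(String errorName) {
        if ("NOT_CONTROLLER".equals(errorName)) {
            activeControllerId = -1;
        }
    }

    /** Proactive path: switch as soon as a controller with a newer epoch is discovered. */
    void onControllerDiscovered(int controllerId, int controllerEpoch) {
        if (controllerEpoch > activeControllerEpoch) {
            activeControllerId = controllerId;
            activeControllerEpoch = controllerEpoch;
        }
    }

    /** The node to send to, or empty if the controller must be rediscovered first. */
    OptionalInt target() {
        return activeControllerId < 0 ? OptionalInt.empty() : OptionalInt.of(activeControllerId);
    }
}
```

With only the reactive path, a broker in this model keeps sending to the old 
controller until that controller answers `NOT_CONTROLLER`. The proactive path 
removes that window, under the assumption that discovery reports an epoch.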
