dajac edited a comment on pull request #11965:
URL: https://github.com/apache/kafka/pull/11965#issuecomment-1083225964


   @bozhao12 Thanks for reporting this one. I took a deeper look at it and I 
agree with your finding. The follower runs the preferred read replica selection 
logic as well. Nice one ;)
   
   In most cases, it still works because, as you said, the 
`RackAwareReplicaSelector` returns the leader when it cannot find a replica in 
the same rack, and the leader is filtered out in the logic. If you have more 
than one replica per rack, you can easily get into a situation where the 
consumer can't consume anything because it is redirected continuously between 
the two replicas. This is only possible if the replica still has some replica 
states left around from its previous leadership. This can happen when the 
partition is reassigned multiple times, for instance.
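   
   To illustrate the fallback described above, here is a minimal sketch of 
rack-aware selection followed by the leader filter. It is a simplified model, 
not the Kafka implementation: `RackReplica`, `RackAwareSketch`, and 
`preferredReadReplica` are hypothetical names, and the sketch takes the first 
same-rack replica rather than comparing how caught up the replicas are.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified replica description: broker id plus rack name.
final class RackReplica {
    final int id;
    final String rack;

    RackReplica(int id, String rack) {
        this.id = id;
        this.rack = rack;
    }
}

// Hypothetical model of rack-aware selection; not the Kafka classes.
final class RackAwareSketch {
    // Prefer a replica in the client's rack; fall back to the leader otherwise.
    static RackReplica select(String clientRack, RackReplica leader, List<RackReplica> replicas) {
        if (clientRack == null || clientRack.isEmpty()) {
            return leader;
        }
        if (clientRack.equals(leader.rack)) {
            return leader;
        }
        // Assumption: first same-rack match wins in this sketch.
        return replicas.stream()
                .filter(r -> clientRack.equals(r.rack))
                .findFirst()
                .orElse(leader);
    }

    // The caller filters out the leader, so "no redirect" is Optional.empty().
    static Optional<Integer> preferredReadReplica(String clientRack, RackReplica leader,
                                                  List<RackReplica> replicas) {
        RackReplica selected = select(clientRack, leader, replicas);
        return selected.id == leader.id ? Optional.empty() : Optional.of(selected.id);
    }
}
```

   With a leader in `rack-a` and a single follower in `rack-b`, a client in 
`rack-c` gets no redirect in this model, because the selector falls back to the 
leader and the leader is then dropped from the result.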
   
   I suppose that things could be worse depending on the implementation of the 
`ReplicaSelector`. Luckily, most people use the `RackAwareReplicaSelector` with 
three replicas.
   
   That seems to be a regression introduced in this commit: 
https://github.com/apache/kafka/commit/fbfda2c4ad889c731aa52b5214e0521f187f8db6.
 
   
   I do agree that we should fix this but we need to come up with a better test 
which verifies this. Your current test does not really fail because the leader 
is automatically removed. We could perhaps create a `MockSelector` which 
implements `ReplicaSelector` and increments a counter or something along those 
lines. What do you think?
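   
   A rough sketch of the counting idea, under stated assumptions: 
`SimpleReplicaSelector` is a hypothetical one-method stand-in for the real 
`ReplicaSelector` interface so the example compiles without the Kafka jars, and 
`FetchPathSketch` is a hypothetical model of the intended behavior in which 
only the leader consults the selector. A real test would implement 
`ReplicaSelector` itself and go through the broker fetch path.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the selector plugin interface.
interface SimpleReplicaSelector {
    Optional<Integer> select(String clientRack, int leaderId);
}

// Counts how often the selection logic is invoked.
class MockSelector implements SimpleReplicaSelector {
    private final AtomicInteger selectionCount = new AtomicInteger();

    @Override
    public Optional<Integer> select(String clientRack, int leaderId) {
        selectionCount.incrementAndGet();
        // Return a fixed non-leader id so the result is not filtered out.
        return Optional.of(leaderId + 1);
    }

    int selectionCount() {
        return selectionCount.get();
    }
}

// Hypothetical model of the fetch path: a follower never runs the selector.
class FetchPathSketch {
    private final boolean isLeader;
    private final int brokerId;
    private final SimpleReplicaSelector selector;

    FetchPathSketch(boolean isLeader, int brokerId, SimpleReplicaSelector selector) {
        this.isLeader = isLeader;
        this.brokerId = brokerId;
        this.selector = selector;
    }

    Optional<Integer> preferredReadReplica(String clientRack) {
        if (!isLeader) {
            return Optional.empty();
        }
        return selector.select(clientRack, brokerId);
    }
}
```

   Because the mock returns a non-leader id, the assertion does not depend on 
the leader being filtered out: a test can check that the counter stays at zero 
after a fetch served by a follower and increases only for fetches served by the 
leader.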


