dajac edited a comment on pull request #11965: URL: https://github.com/apache/kafka/pull/11965#issuecomment-1083225964
@bozhao12 Thanks for reporting this one. I took a deeper look at it and I agree with your finding: the follower runs the preferred read replica selection logic as well. Nice one ;)

In most cases it still works because, as you said, the `RackAwareReplicaSelector` returns the leader when it cannot find a replica in the same rack, and the leader is filtered out in the logic. However, if you have more than one replica per rack, you can easily get into a situation where the consumer can't consume anything because it is continuously redirected between the two replicas. This is only possible if the replica still has some replica states left around from its previous leadership, which can happen when the partition is reassigned multiple times, for instance. I suppose that things could be worse depending on the implementation of the `ReplicaSelector`. Luckily, most people use the `RackAwareReplicaSelector` and three replicas.

That seems to be a regression introduced in this commit: https://github.com/apache/kafka/commit/fbfda2c4ad889c731aa52b5214e0521f187f8db6.

I do agree that we should fix this, but we need to come up with a better test that verifies it. Your current test does not really fail because the leader is automatically removed. We could perhaps create a `MockSelector` which implements `ReplicaSelector` and increments a counter, or something along those lines. What do you think?
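
To make the `MockSelector` idea a bit more concrete, here is a rough, self-contained sketch. It deliberately does not use Kafka's real types: the `ReplicaSelector` interface below is a simplified stand-in with a reduced signature, and `FetchPathSketch.preferredReadReplica` is a hypothetical placeholder for the broker-side logic, assumed here to skip selection entirely when the broker is not the leader. Only the counting idea is taken from the comment above; everything else is an assumption for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for Kafka's ReplicaSelector (NOT the real signature):
// picks a replica id for a consumer in the given rack.
interface ReplicaSelector {
    Optional<Integer> select(String topicPartition, String clientRack, List<Integer> replicaIds);
}

// Hypothetical test double: records how many times selection was attempted
// and returns the first candidate, so it never silently falls back to the leader.
final class MockSelector implements ReplicaSelector {
    private final AtomicLong selectionCount = new AtomicLong();

    @Override
    public Optional<Integer> select(String topicPartition, String clientRack, List<Integer> replicaIds) {
        selectionCount.incrementAndGet();
        return replicaIds.stream().findFirst();
    }

    long selectionCount() {
        return selectionCount.get();
    }
}

// Hypothetical placeholder for the fetch path under test. The assumed intended
// behaviour is that only the leader runs the preferred read replica selection.
final class FetchPathSketch {
    static Optional<Integer> preferredReadReplica(boolean isLeader,
                                                  ReplicaSelector selector,
                                                  String clientRack,
                                                  List<Integer> replicaIds) {
        if (!isLeader) {
            return Optional.empty(); // a follower must not redirect the consumer
        }
        return selector.select("topic-0", clientRack, replicaIds);
    }
}
```

With a selector like this, a test can assert that the counter stays at zero after a fetch handled by a follower and is incremented only for a fetch handled by the leader, which would fail on the regressed code path regardless of whether the leader is later filtered out.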