Hello,
This is Arun from LINE Corporation.
We run a cluster with a large number of brokers (200+), so node failures are 
bound to happen relatively often. When a failed machine recovers, or when the 
replicas on a failed node are reassigned, a large number of lagging replicas 
often have to catch up at once. Multiple replicas (re-)assigned to a target 
broker can start fetching from the same source broker that holds their leader 
replicas. This occasionally degrades Produce response time, as illustrated in 
https://issues.apache.org/jira/browse/KAFKA-10690 .
​
I wanted to check whether anyone else has faced this, and whether a solution merits a KIP.
​
With regards,
マテュアルン Mathew Arun
LINE Corporation
