Lucas Wang created KAFKA-10734:
----------------------------------

             Summary: Speedup the processing of LeaderAndIsr request
                 Key: KAFKA-10734
                 URL: https://issues.apache.org/jira/browse/KAFKA-10734
             Project: Kafka
          Issue Type: Improvement
            Reporter: Lucas Wang
            Assignee: Lucas Wang


Consider the case where a LeaderAndIsr request contains many partitions for 
which the broker is asked to become the follower. Let's call these partitions 
*partitionsToMakeFollower*. Furthermore, let's assume the cluster has n 
brokers and each broker is configured to have m replica fetchers (via the 
num.replica.fetchers config). 
The broker is then likely to have (n-1) * m fetcher threads, i.e. m for each 
of the other n-1 brokers.
Processing the LeaderAndIsr request requires:
1. removing the "partitionsToMakeFollower" from all of the fetcher threads 
sequentially so that they won't be fetching from obsolete leaders.

2. adding the "partitionsToMakeFollower" to all of the fetcher threads 
sequentially

3. shutting down the idle fetcher threads sequentially (by checking the number 
of partitions held by each fetcher thread)
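
The three steps above can be sketched with a toy model. This is only an 
illustration, not Kafka's actual code: the FetcherThread class, the 
make_followers function and the pick_fetcher callback are hypothetical names, 
and the lock merely stands in for the per-fetcher "partitionMapLock".

```python
import threading
from dataclasses import dataclass, field


@dataclass
class FetcherThread:
    """Toy stand-in for a replica fetcher thread (hypothetical, not Kafka's class)."""
    name: str
    partitions: set = field(default_factory=set)
    partition_map_lock: threading.Lock = field(default_factory=threading.Lock)
    running: bool = True


def make_followers(fetchers, partitions_to_make_follower, pick_fetcher):
    """Run the three steps sequentially and count the lock acquisitions.

    fetchers: dict mapping a fetcher name to its FetcherThread.
    pick_fetcher: hypothetical callback mapping a partition to a fetcher name.
    """
    lock_acquisitions = 0

    # Step 1: remove the partitions from every fetcher, one fetcher at a time.
    for fetcher in fetchers.values():
        with fetcher.partition_map_lock:
            lock_acquisitions += 1
            fetcher.partitions -= partitions_to_make_follower

    # Step 2: add each partition to the fetcher for its new leader,
    # again visiting the fetchers one at a time.
    for name, fetcher in fetchers.items():
        assigned = {p for p in partitions_to_make_follower if pick_fetcher(p) == name}
        with fetcher.partition_map_lock:
            lock_acquisitions += 1
            fetcher.partitions |= assigned

    # Step 3: shut down fetchers that no longer hold any partition.
    for fetcher in fetchers.values():
        with fetcher.partition_map_lock:
            lock_acquisitions += 1
            if not fetcher.partitions:
                fetcher.running = False

    return lock_acquisitions
```

In this model every step takes each fetcher's lock once from the calling 
thread, so a single request costs three lock acquisitions per fetcher thread, 
each of which may have to wait for a fetcher that currently holds the lock.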

On top of that, each of the 3 operations above is handled by the request 
handler thread (i.e. io thread). To complete each operation, the request 
handler thread needs to contend for the "partitionMapLock" with the 
corresponding fetcher thread. In the worst case, the request handler thread is 
blocked (n-1) * m times while removing the partitions, another (n-1) * m 
times while adding the partitions, and yet another (n-1) * m times while 
shutting down the idle fetcher threads.
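
To make the worst case concrete, the counts can be worked out for a given 
cluster size. The helper names below are made up for this illustration, and 
n = 10, m = 4 is just an example, not a measurement from a real cluster.

```python
def fetcher_thread_count(n: int, m: int) -> int:
    """Fetcher threads on one broker: m for each of the other n - 1 brokers."""
    return (n - 1) * m


def worst_case_blocking_count(n: int, m: int) -> int:
    """Worst-case number of times the request handler thread is blocked:
    once per fetcher thread for each of the 3 operations
    (remove, add, shut down idle fetchers)."""
    return 3 * fetcher_thread_count(n, m)


# Example: 10 brokers with num.replica.fetchers = 4.
print(fetcher_thread_count(10, 4))       # 36
print(worst_case_blocking_count(10, 4))  # 108
```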

Overall, this blocking can significantly delay the processing of the 
LeaderAndIsr request. A further implication is that if the follower is 
delayed in fetching from the leader, there could be under-MinISR partitions 
in the cluster, causing unavailability for clients.

This ticket is created to track speeding up the processing of the 
LeaderAndIsr request.





--
This message was sent by Atlassian Jira
(v8.3.4#803005)
