[ https://issues.apache.org/jira/browse/KAFKA-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949834#comment-16949834 ]
Stanislav Kozlovski commented on KAFKA-8667:
--------------------------------------------

[~hzxa21] have you considered submitting a patch for this along with KAFKA-8668? I saw you've authored patches for these JIRAs in LinkedIn's open-source Kafka - [https://github.com/linkedin/kafka/commit/feed875f8fcd8b9f8b8539e8a9b2e477a67b2faf]

> Improve leadership transition time
> ----------------------------------
>
>                 Key: KAFKA-8667
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8667
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Zhanxiang (Patrick) Huang
>            Assignee: Zhanxiang (Patrick) Huang
>            Priority: Major
>
> When the replica fetcher thread processes a fetch response, it holds the
> {{partitionMapLock}}. If a LeaderAndIsr request comes in at the same time,
> it will be blocked at the end of its processing when calling
> {{shutdownIdleFetcherThread}}, because it needs to acquire the
> {{partitionMapLock}} of each replica fetcher thread to check whether any
> partition is assigned to that fetcher, and the request handler thread
> performs this check sequentially for the fetcher threads.
>
> For example, in a cluster with 20 brokers and num.replica.fetcher.thread set
> to 32, if each fetcher thread holds its lock a little longer, the total
> time for the request handler thread to finish shutdownIdleFetcherThread can
> be much larger, because it waits longer for the partitionMapLock of each
> fetcher thread. If the LeaderAndIsr request is blocked for more than
> request.timeout.ms (defaults to 30s) in the broker, the request send thread
> on the controller side will time out while waiting for the response, try to
> establish a new connection to the broker and re-send the request, which
> breaks in-order delivery because more than one channel will be talking to
> the broker.
> Moreover, this may make the lock contention problem worse or saturate the
> request handler threads, because duplicate control requests are sent to the
> broker multiple times. In our own testing, we saw up to *8 duplicate
> LeaderAndIsrRequest* being sent to the broker during bounce, and the 99th
> percentile LeaderAndIsr local time went up to ~500s.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
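
As a rough back-of-the-envelope illustration of the example in the quoted description, the Python sketch below estimates how long a request handler could wait if it acquires each fetcher thread's lock one after another. It is not Kafka code. The function names are hypothetical, the 50 ms per-lock wait is an invented figure, and the thread count assumes one fetcher thread per (source broker, fetcher id) pair, so a broker in a 20-broker cluster with 32 fetchers per source would have up to (20 - 1) * 32 threads.

```python
REQUEST_TIMEOUT_MS = 30_000  # request.timeout.ms default cited in the description


def fetcher_thread_count(num_brokers: int, fetchers_per_source: int) -> int:
    """Upper bound on fetcher threads on one broker (assumption: one thread
    per source broker and fetcher id, excluding the broker itself)."""
    return (num_brokers - 1) * fetchers_per_source


def sequential_wait_ms(num_threads: int, wait_per_lock_ms: int) -> int:
    """Total wait when each thread's lock is acquired one after another."""
    return num_threads * wait_per_lock_ms


threads = fetcher_thread_count(num_brokers=20, fetchers_per_source=32)
total_ms = sequential_wait_ms(threads, wait_per_lock_ms=50)  # 50 ms is hypothetical

print(f"fetcher threads: {threads}")
print(f"total sequential wait: {total_ms} ms")
print(f"exceeds request timeout: {total_ms > REQUEST_TIMEOUT_MS}")
```

Under these assumptions, an average wait of roughly 50 ms per lock is already enough for the sequential check to exceed the 30s timeout, which is the point at which the controller would reconnect and re-send the request.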