Manikumar created KAFKA-9594:
--------------------------------

             Summary: speed up the processing of LeaderAndIsrRequest
                 Key: KAFKA-9594
                 URL: https://issues.apache.org/jira/browse/KAFKA-9594
             Project: Kafka
          Issue Type: Improvement
            Reporter: Jun Rao
            Assignee: Manikumar
             Fix For: 2.6.0

Observations from [~junrao]:

Currently, Partition.makeFollower() holds the write lock on leaderIsrUpdateLock, while Partition.doAppendRecordsToFollowerOrFutureReplica() holds the read lock on leaderIsrUpdateLock. So, if there is an ongoing log append on the follower, the makeFollower() call is delayed until that append finishes.

The path is a bit different when serving the Partition.makeLeader() call. Before calling Partition.makeLeader(), we first remove the follower from the replicaFetcherThread, so the makeLeader() call is not delayed by a log append.

This means that when we change one follower to become leader and another follower to follow the new leader during a controlled shutdown, the makeLeader() call typically completes faster than the makeFollower() call, which can delay the follower fetching from the new leader and cause the ISR to shrink.

The only reason that Partition.doAppendRecordsToFollowerOrFutureReplica() needs to hold the read lock on leaderIsrUpdateLock is so that Partition.maybeReplaceCurrentWithFutureReplica() can pause the log append while checking whether the log dir can be replaced. We could potentially add a separate lock (something like futureLogLock) that is shared between maybeReplaceCurrentWithFutureReplica() and doAppendRecordsToFollowerOrFutureReplica(). Then doAppendRecordsToFollowerOrFutureReplica() no longer needs to hold leaderIsrUpdateLock.
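
To illustrate the proposed lock split, here is a minimal, self-contained sketch in Java (Partition itself is Scala, and this is not the actual Kafka code). The method names mirror the ones discussed above, and futureLogLock is the name suggested in this ticket. The class name, method bodies, timeouts and the demo() helper are all hypothetical. Whether maybeReplaceCurrentWithFutureReplica() would also still need leaderIsrUpdateLock is not modelled here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for Partition; only the locking pattern is modelled.
class PartitionLockSketch {
    private final ReentrantReadWriteLock leaderIsrUpdateLock = new ReentrantReadWriteLock();
    // Assumption: the separate lock proposed in this ticket.
    private final ReentrantReadWriteLock futureLogLock = new ReentrantReadWriteLock();

    // Follower append takes only the read lock on futureLogLock,
    // so it no longer touches leaderIsrUpdateLock.
    void doAppendRecordsToFollowerOrFutureReplica(Runnable append) {
        futureLogLock.readLock().lock();
        try {
            append.run();
        } finally {
            futureLogLock.readLock().unlock();
        }
    }

    // Returns true if the write lock on leaderIsrUpdateLock was obtained in time.
    boolean makeFollower(long timeoutMs) throws InterruptedException {
        if (!leaderIsrUpdateLock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            return true; // leader/ISR state would be updated here
        } finally {
            leaderIsrUpdateLock.writeLock().unlock();
        }
    }

    // Returns true if appends could be paused (write lock on futureLogLock) in time.
    boolean maybeReplaceCurrentWithFutureReplica(long timeoutMs) throws InterruptedException {
        if (!futureLogLock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            return true; // the log dir check and swap would happen here
        } finally {
            futureLogLock.writeLock().unlock();
        }
    }

    // Runs a long append on another thread, then tries both operations while it is in flight.
    // Result: { makeFollower proceeded, maybeReplaceCurrentWithFutureReplica proceeded }.
    static boolean[] demo() {
        PartitionLockSketch partition = new PartitionLockSketch();
        CountDownLatch appendStarted = new CountDownLatch(1);
        CountDownLatch finishAppend = new CountDownLatch(1);
        Thread fetcher = new Thread(() ->
            partition.doAppendRecordsToFollowerOrFutureReplica(() -> {
                appendStarted.countDown();
                try {
                    finishAppend.await(); // simulate a slow log append
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        try {
            fetcher.start();
            appendStarted.await();
            boolean followerProceeded = partition.makeFollower(100);
            boolean replaceProceeded = partition.maybeReplaceCurrentWithFutureReplica(100);
            finishAppend.countDown();
            fetcher.join();
            return new boolean[] { followerProceeded, replaceProceeded };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("makeFollower proceeded during append: " + result[0]);
        System.out.println("replace proceeded during append: " + result[1]);
    }
}
```

In this sketch makeFollower() is not held up by the in-flight append, while maybeReplaceCurrentWithFutureReplica() still waits for it, which is the behaviour the proposal aims for.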