Zhanxiang (Patrick) Huang created KAFKA-8571:
------------------------------------------------
             Summary: Not complete delayed produce requests when processing StopReplicaRequest causing high produce latency for acks=all
                 Key: KAFKA-8571
                 URL: https://issues.apache.org/jira/browse/KAFKA-8571
             Project: Kafka
          Issue Type: Bug
            Reporter: Zhanxiang (Patrick) Huang
            Assignee: Zhanxiang (Patrick) Huang

Currently, a broker only attempts to complete delayed requests when the high watermark changes or when it receives a LeaderAndIsrRequest. When a broker receives a StopReplicaRequest, it does not try to complete delayed operations, including delayed produce requests for acks=all. As a result, the producer can time out, whereas it would have retried against the new leader much sooner had the broker returned a NotLeaderForPartition error.

This can happen during partition reassignment, when the controller is trying to remove the previous leader from the replica set. In this case, the controller sends only a StopReplicaRequest (not a LeaderAndIsrRequest) to the previous leader during the replica set shrink phase. Here is an example:

{noformat}
Reassigning the replica set of partition A from [B1, B2] to [B2, B3]:
t0: Controller expands the replica set to [B1, B2, B3].
t1: B1 receives produce request PR on partition A with acks=all and timeout T. B1 puts PR into the DelayedProducePurgatory with timeout T.
t2: Controller elects B2 as the new leader and shrinks the replica set to [B2, B3]. LeaderAndIsrRequests are sent to B2 and B3. StopReplicaRequest is sent to B1.
t3: B1 receives StopReplicaRequest but doesn't try to complete PR.

If PR cannot be fulfilled by t3, and t1 + T > t3, PR will eventually time out in the purgatory and the producer will eventually time out the produce request.
{noformat}

Since the leader can leave the replica set after receiving only a StopReplicaRequest (without receiving any LeaderAndIsrRequest), a fix for this issue is to also try to complete delayed operations when processing StopReplicaRequest.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
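
The t0..t3 timeline in the description can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Kafka's broker code: the Broker and DelayedProduce classes, the run helper, the error-code strings, and the concrete times (t1 = 1, t3 = 3, T = 30) are all hypothetical names and values chosen for illustration. It compares a broker that ignores its purgatory on StopReplicaRequest with one that completes delayed produce requests at that point.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical error labels for this sketch (not taken from Kafka's source).
NOT_LEADER = "NOT_LEADER_FOR_PARTITION"
TIMED_OUT = "REQUEST_TIMED_OUT"


@dataclass
class DelayedProduce:
    partition: str
    deadline: float                       # t1 + T
    result: Optional[str] = None          # error code once completed
    completed_at: Optional[float] = None  # time at which the producer hears back


@dataclass
class Broker:
    complete_on_stop_replica: bool        # False: reported behavior, True: proposed fix
    leader_of: Set[str] = field(default_factory=set)
    purgatory: Dict[str, List[DelayedProduce]] = field(default_factory=dict)

    def produce(self, partition: str, now: float, timeout: float) -> DelayedProduce:
        """Park an acks=all produce request until it is completed or expires."""
        op = DelayedProduce(partition, deadline=now + timeout)
        if partition not in self.leader_of:
            op.result, op.completed_at = NOT_LEADER, now
        else:
            self.purgatory.setdefault(partition, []).append(op)
        return op

    def stop_replica(self, partition: str, now: float) -> None:
        """Handle a StopReplicaRequest for the given partition."""
        self.leader_of.discard(partition)
        if self.complete_on_stop_replica:
            # Proposed fix: answer parked requests right away so the
            # producer can retry against the new leader.
            for op in self.purgatory.pop(partition, []):
                op.result, op.completed_at = NOT_LEADER, now

    def expire(self, now: float) -> None:
        """Time out every parked request whose deadline has passed."""
        for partition, ops in list(self.purgatory.items()):
            remaining = []
            for op in ops:
                if op.deadline <= now:
                    op.result, op.completed_at = TIMED_OUT, op.deadline
                else:
                    remaining.append(op)
            self.purgatory[partition] = remaining


def run(complete_on_stop_replica: bool) -> Tuple[Optional[str], Optional[float]]:
    b1 = Broker(complete_on_stop_replica, leader_of={"A"})
    pr = b1.produce("A", now=1.0, timeout=30.0)  # t1: PR enters the purgatory
    b1.stop_replica("A", now=3.0)                # t3: StopReplicaRequest arrives
    b1.expire(now=31.0)                          # t1 + T: purgatory expiry
    return pr.result, pr.completed_at


if __name__ == "__main__":
    print("without fix:", run(False))
    print("with fix:   ", run(True))
```

In this model, the request on the unfixed broker is answered only at t1 + T with a timeout, while the fixed broker answers at t3 with a not-leader error, which is the latency difference the issue describes.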