Zhanxiang (Patrick) Huang created KAFKA-8571:
------------------------------------------------
Summary: Not complete delayed produce requests when processing
StopReplicaRequest causing high produce latency for acks=all
Key: KAFKA-8571
URL: https://issues.apache.org/jira/browse/KAFKA-8571
Project: Kafka
Issue Type: Bug
Reporter: Zhanxiang (Patrick) Huang
Assignee: Zhanxiang (Patrick) Huang
Currently a broker only attempts to complete delayed requests when the high
watermark changes or when it receives a LeaderAndIsrRequest. When a broker
receives a StopReplicaRequest, it does not try to complete delayed operations,
including delayed produce requests for acks=all. This can cause the producer to
time out, even though it would have contacted the new leader sooner had a
NotLeaderForPartition error been returned.
This can happen during partition reassignment when the controller is trying to
remove the previous leader from the replica set. In this case, the controller
sends only a StopReplicaRequest (not a LeaderAndIsrRequest) to the previous
leader during the replica set shrink phase. Here is an example:
{noformat}
Reassigning the replica set of partition A from [B1, B2] to [B2, B3]:
t0: Controller expands the replica set to [B1, B2, B3]
t1: B1 receives produce request PR on partition A with acks=all and timeout T.
B1 puts PR into the DelayedProducePurgatory with timeout T.
t2: Controller elects B2 as the new leader and shrinks the replica set to [B2,
B3]. LeaderAndIsrRequests are sent to B2 and B3. StopReplicaRequest is sent to
B1.
t3: B1 receives StopReplicaRequest but doesn't try to complete PR.
If PR cannot be fulfilled by t3, and t1 + T > t3, PR will eventually time out
in the purgatory and the producer will eventually time out the produce
request.{noformat}
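To put numbers on the timeline above, the following sketch compares when B1
answers PR with and without completing it at t3. The function name and the
sample timestamps are illustrative assumptions, not measurements from a real
cluster, and the model assumes PR is never satisfied by follower acks.

```python
def response_time(t1, timeout, t3, complete_on_stop_replica):
    """Time at which B1 answers PR, assuming PR is never satisfied by acks."""
    expiry = t1 + timeout
    if complete_on_stop_replica and t3 < expiry:
        # B1 re-checks PR at t3, sees it no longer hosts the partition,
        # and can fail the request right away.
        return t3
    # Otherwise PR sits in the purgatory until its timeout fires.
    return expiry


# Hypothetical timestamps in milliseconds: PR arrives at t1=100 with T=30000,
# and the StopReplicaRequest reaches B1 at t3=150.
current = response_time(100, 30000, 150, complete_on_stop_replica=False)
proposed = response_time(100, 30000, 150, complete_on_stop_replica=True)
print(current, proposed)
```

With these assumed values the producer waits the full timeout under the current
behavior, but only until t3 if the delayed produce is re-checked on
StopReplicaRequest.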
Since it is possible for the leader to receive only a StopReplicaRequest
(without receiving any LeaderAndIsrRequest) when it leaves the replica set, a
fix for this issue is to also try to complete delayed operations when
processing StopReplicaRequest.
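The proposed fix can be illustrated with a toy model of a broker and its
purgatory. This is a minimal sketch and not Kafka's broker code: the class
names, method names, and error strings below are hypothetical stand-ins chosen
for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Error labels used only inside this toy model.
NOT_LEADER_FOR_PARTITION = "NOT_LEADER_FOR_PARTITION"
REQUEST_TIMED_OUT = "REQUEST_TIMED_OUT"


@dataclass
class DelayedProduce:
    partition: str
    deadline_ms: int
    error: Optional[str] = None  # None while the request is still pending


@dataclass
class ToyBroker:
    hosted: set = field(default_factory=set)
    purgatory: Dict[str, List[DelayedProduce]] = field(default_factory=dict)

    def produce_acks_all(self, partition, now_ms, timeout_ms):
        # Park the request until it can be completed or its timeout fires.
        op = DelayedProduce(partition, now_ms + timeout_ms)
        self.purgatory.setdefault(partition, []).append(op)
        return op

    def check_and_complete(self, partition):
        # Re-evaluate pending operations: if this broker no longer hosts
        # the partition, fail them so the producer can refresh metadata.
        if partition not in self.hosted:
            for op in self.purgatory.pop(partition, []):
                op.error = NOT_LEADER_FOR_PARTITION

    def expire(self, now_ms):
        # Time out every operation whose deadline has passed.
        for partition, ops in list(self.purgatory.items()):
            for op in ops:
                if now_ms >= op.deadline_ms:
                    op.error = REQUEST_TIMED_OUT
            self.purgatory[partition] = [op for op in ops if op.error is None]

    def handle_stop_replica(self, partition, complete_delayed):
        self.hosted.discard(partition)
        if complete_delayed:  # the behavior proposed in this ticket
            self.check_and_complete(partition)


def run(complete_delayed):
    broker = ToyBroker(hosted={"A"})
    pr = broker.produce_acks_all("A", now_ms=100, timeout_ms=30000)
    broker.handle_stop_replica("A", complete_delayed)
    error_at_t3 = pr.error
    broker.expire(now_ms=30100)
    return error_at_t3, pr.error


print(run(False))  # (None, 'REQUEST_TIMED_OUT')
print(run(True))   # ('NOT_LEADER_FOR_PARTITION', 'NOT_LEADER_FOR_PARTITION')
```

In the first run PR stays pending after the StopReplicaRequest and only fails
when its timeout expires. In the second run it fails immediately with a
not-leader error, which lets the producer retry against the new leader.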