mumrah opened a new pull request #10561:
URL: https://github.com/apache/kafka/pull/10561


   Copied from the JIRA:
   
   > In Partition.scala, there is a race condition between the handling of an 
AlterIsrResponse and a LeaderAndIsrRequest. This is a pretty rare scenario and 
would involve the AlterIsrResponse being delayed for some time, but it is 
possible. This was observed in a test environment when lots of ISR and 
leadership changes were happening due to broker restarts.
   > 
   > When the leader handles the LeaderAndIsr, it calls Partition#makeLeader 
which overrides the isrState variable and clears the pending ISR items via 
AlterIsrManager#clearPending(TopicPartition).
   > 
   > The bug is that AlterIsrManager does not check its inflight state before 
clearing pending items. The way AlterIsrManager is designed, it retains 
inflight items in the pending items collection until the response is processed 
(to allow for retries). The result is that an inflight item is inadvertently 
removed from this collection.
   > 
   > Since the inflight item is cleared from the collection, AlterIsrManager 
allows for new AlterIsrItem-s to be enqueued for this partition even though it 
has an inflight AlterIsrItem. By allowing an update to be enqueued, Partition 
will transition its isrState to one of the inflight states (PendingIsrExpand, 
PendingIsrShrink, etc). Once the inflight partition's response is handled, it 
will fail to update the isrState due to detecting changes since the request was 
sent (which is by design). However, after the response callback is run, 
AlterIsrManager will clear the partitions that it saw in the response from the 
unsent items collection. This includes the newly added (and unsent) update.
   > 
   > The result is that Partition has an "inflight" isrState but AlterIsrManager 
does not have an unsent item for this partition. This prevents any further ISR 
updates on the partition until the next leader election (when isrState is 
reset).
   > 
   > If this bug is encountered, the workaround is to force a leader election 
which will reset the partition's state.
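
The sequence described above can be reproduced with a small toy model. The sketch below is not Kafka's code: `ToyAlterIsrManager`, `Item`, `send`, `handleResponse`, `hasPending`, and the string partition key are hypothetical stand-ins. It assumes only the behavior stated in the JIRA text: inflight items stay in the pending collection, `clearPending` does not check the inflight state, and the response handler removes every partition it saw in the response.

```scala
import scala.collection.mutable

// Toy model only: names echo the description above, but none of this is Kafka's code.
object AlterIsrRaceSketch {
  final case class Item(partition: String, isr: Set[Int])

  final class ToyAlterIsrManager {
    // Unsent and inflight items share one map, keyed by partition.
    private val pending = mutable.Map.empty[String, Item]
    private var inflight: Option[Set[String]] = None

    // Rejects a second item for a partition that already has one queued.
    def enqueue(item: Item): Boolean =
      if (pending.contains(item.partition)) false
      else { pending.put(item.partition, item); true }

    // Marks the current items as inflight; they stay in the map to allow retries.
    def send(): Unit = {
      val sent: Set[String] = pending.keySet.toSet
      inflight = Some(sent)
    }

    // The unsafe call: removes the entry without checking the inflight state.
    def clearPending(partition: String): Unit = {
      pending.remove(partition)
      ()
    }

    // After the response, drop every partition that was part of the request.
    def handleResponse(): Unit = {
      inflight.foreach(sent => pending --= sent)
      inflight = None
    }

    def hasPending(partition: String): Boolean = pending.contains(partition)
  }

  // Returns (second enqueue accepted?, manager still holds an item afterwards?).
  def run(): (Boolean, Boolean) = {
    val mgr = new ToyAlterIsrManager
    mgr.enqueue(Item("t-0", Set(1, 2)))
    mgr.send()                                   // request is now inflight
    mgr.clearPending("t-0")                      // what makeLeader triggers
    val accepted = mgr.enqueue(Item("t-0", Set(1, 2, 3))) // accepted despite the inflight request
    mgr.handleResponse()                         // also wipes the new, unsent item
    (accepted, mgr.hasPending("t-0"))
  }
}
```

In this model `AlterIsrRaceSketch.run()` yields `(true, false)`: the second update was accepted, so the caller would consider an update pending, yet the manager no longer holds anything to send for that partition.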
   
   
   This PR removes the clearPending call from AlterIsrManager. As seen with 
this bug, this method is not safe to call any time there is an AlterIsrRequest 
in-flight. We could add more protections around this call, but it is simpler 
(and safer) to just remove it. Clearing unsent ISR updates is not really 
necessary after a leader election since the updates will fail due to a stale 
leader epoch.
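
To illustrate why leaving unsent updates in place is harmless, here is a minimal sketch of an epoch check. It is an assumption-laden toy, not the actual validation path: `Update`, `ToyValidator`, and `validate` are hypothetical names, and the only behavior taken from the text above is that an update carrying a leader epoch older than the current one fails.

```scala
// Toy model only: illustrates a stale-epoch rejection, not Kafka's real validation.
object StaleEpochSketch {
  final case class Update(partition: String, leaderEpoch: Int, isr: Set[Int])

  // Hypothetical stand-in for whichever component validates ISR updates.
  final class ToyValidator(var currentEpoch: Int) {
    def validate(u: Update): Either[String, Set[Int]] =
      if (u.leaderEpoch < currentEpoch)
        Left(s"stale leader epoch ${u.leaderEpoch} < $currentEpoch")
      else
        Right(u.isr)
  }

  // Returns (old update rejected?, new update accepted?).
  def run(): (Boolean, Boolean) = {
    val validator = new ToyValidator(currentEpoch = 5)
    val queuedBeforeElection = Update("t-0", leaderEpoch = 5, isr = Set(1, 2))
    validator.currentEpoch = 6 // a leader election bumps the epoch
    val queuedAfterElection = Update("t-0", leaderEpoch = 6, isr = Set(1, 2, 3))
    (validator.validate(queuedBeforeElection).isLeft,
     validator.validate(queuedAfterElection).isRight)
  }
}
```

Under these assumptions an update queued before the election is rejected on its own, so nothing needs to clear it proactively.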
   



