[ https://issues.apache.org/jira/browse/KAFKA-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752790#comment-16752790 ]
ASF GitHub Bot commented on KAFKA-7837:
---------------------------------------

dhruvilshah3 commented on pull request #6202: KAFKA-7837: Ensure we do not shrink ISR for offline partitions
URL: https://github.com/apache/kafka/pull/6202

> maybeShrinkIsr may not reflect OfflinePartitions immediately
> ------------------------------------------------------------
>
>                 Key: KAFKA-7837
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7837
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jun Rao
>            Assignee: Dhruvil Shah
>            Priority: Major
>
> When a partition is marked offline due to a failed disk, the leader is
> supposed to not shrink its ISR any more. In ReplicaManager.maybeShrinkIsr(),
> we iterate through all non-offline partitions to shrink the ISR. If an ISR
> needs to shrink, we need to write the new ISR to ZK, which can take a bit of
> time. In this window, some partitions could now be marked as offline, but may
> not be picked up by the iterator since it only reflects the state at that
> point. This can cause all in-sync followers to be dropped out of ISR
> unnecessarily and prevents a clean leader election.
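
The window described in the issue can be illustrated with a small simulation. This is only a sketch and not Kafka's code: the Partition class, maybe_shrink_isr(), the write_isr callback (standing in for the slow ZK write) and the recheck_offline flag are all hypothetical names invented for the example. It assumes the fix is to re-check the offline flag immediately before shrinking each partition; the actual change in pull request #6202 may differ.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    # Hypothetical stand-in for a leader partition; not Kafka's class.
    name: str
    isr: set            # broker ids currently in sync
    lagging: set        # broker ids that look out of sync
    offline: bool = False


def maybe_shrink_isr(partitions, write_isr, recheck_offline):
    # The snapshot only reflects which partitions were online when the
    # iteration started, as in the scenario described in the issue.
    snapshot = [p for p in partitions if not p.offline]
    shrunk = []
    for p in snapshot:
        # Assumed fix: look at the live offline flag right before shrinking.
        if recheck_offline and p.offline:
            continue
        new_isr = p.isr - p.lagging
        if new_isr != p.isr:
            write_isr(p, new_isr)   # stands in for the slow ZK write
            p.isr = new_isr
            shrunk.append(p.name)
    return shrunk


def run(recheck_offline):
    parts = [
        Partition("t-0", {1, 2, 3}, {3}),
        Partition("t-1", {1, 2, 3}, {2, 3}),
    ]

    def slow_write(p, new_isr):
        # While the write for t-0 is in flight, the disk hosting t-1 fails
        # and t-1 is marked offline.
        if p.name == "t-0":
            parts[1].offline = True

    shrunk = maybe_shrink_isr(parts, slow_write, recheck_offline)
    return shrunk, parts[1].isr


print(run(recheck_offline=False))  # t-1 is shrunk although it went offline
print(run(recheck_offline=True))   # t-1 keeps its full ISR
```

Without the re-check, t-1 loses all of its followers from the ISR even though it was marked offline during the iteration; with the re-check, its ISR is left untouched.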