Jun Rao created KAFKA-7837:
------------------------------

             Summary: maybeShrinkIsr may not reflect OfflinePartitions immediately
                 Key: KAFKA-7837
                 URL: https://issues.apache.org/jira/browse/KAFKA-7837
             Project: Kafka
          Issue Type: Improvement
            Reporter: Jun Rao


When a partition is marked offline due to a failed disk, the leader is supposed
to stop shrinking its ISR. In ReplicaManager.maybeShrinkIsr(), we iterate
through all non-offline partitions to shrink the ISR. If an ISR needs to
shrink, we need to write the new ISR to ZK, which can take some time. During
this window, other partitions could be marked offline, but the iterator may not
pick up the change since it only reflects the state at the time it was created.
This can cause all in-sync followers to be dropped from the ISR unnecessarily
and prevent a clean leader election.
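
The window described above can be illustrated with a small, single-threaded
simulation. This is only a sketch under stated assumptions: the class
IsrShrinkSketch, the method run(), and the partition names are hypothetical
and are not Kafka code, and the re-check shown is one possible mitigation, not
necessarily the fix adopted for this issue. The slow ZK write is modeled by
marking p1 and p2 offline while p0 is being processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Kafka.
class IsrShrinkSketch {

    /** Returns the partitions whose ISR was shrunk during one pass. */
    static List<String> run(boolean recheckOffline) {
        Set<String> offline = new HashSet<>();
        List<String> all = Arrays.asList("p0", "p1", "p2");

        // Snapshot of non-offline partitions, taken once before the loop.
        List<String> snapshot = new ArrayList<>();
        for (String p : all) {
            if (!offline.contains(p)) {
                snapshot.add(p);
            }
        }

        List<String> shrunk = new ArrayList<>();
        for (String p : snapshot) {
            // Possible mitigation: consult the live offline set again
            // instead of trusting the snapshot.
            if (recheckOffline && offline.contains(p)) {
                continue;
            }
            shrunk.add(p);

            // Simulated slow ZK write for p0: while it is in flight,
            // a disk fails and p1 and p2 are marked offline.
            if (p.equals("p0")) {
                offline.add("p1");
                offline.add("p2");
            }
        }
        return shrunk;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [p0, p1, p2]
        System.out.println(run(true));  // prints [p0]
    }
}
```

Without the re-check, the stale snapshot lets the loop shrink the ISR of
partitions that went offline mid-pass. With it, only p0 is touched.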



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
