Neha Narkhede created KAFKA-1097:
------------------------------------
Summary: Race condition while reassigning partition leads to
incorrect ISR information in zookeeper
Key: KAFKA-1097
URL: https://issues.apache.org/jira/browse/KAFKA-1097
Project: Kafka
Issue Type: Bug
Components: controller
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
While moving partitions, the controller moves the old replicas through the
following state changes:
ONLINE -> OFFLINE -> NON_EXISTENT
During the OFFLINE state change, the controller removes the old replica from
the ISR, writes the updated ISR to zookeeper, and notifies the leader. Note
that it doesn't notify the old replicas to stop fetching from the leader (to
be fixed in KAFKA-1032). During the NON_EXISTENT state change, the controller
does not write the updated ISR or replica list to zookeeper. Right after the
NON_EXISTENT state change, the controller writes the new replica list to
zookeeper, but does not update the ISR. So an old replica can send a fetch
request after the OFFLINE state change, which lets the leader add it back to
the ISR. As a result, a replica that is no longer in the assigned replica list
remains in the ISR stored in zookeeper.
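
To make the ordering concrete, here is a minimal sketch of the sequence
described above. It is a toy model, not Kafka controller code: every name in
it (ZkState, controller_offline, leader_handle_fetch, and so on) is
hypothetical, and it assumes the leader re-adds any follower that keeps
fetching, which is the behavior this report relies on.

```python
# Toy model of the reassignment race; NOT actual Kafka code.
# All class and function names here are hypothetical illustrations.
from dataclasses import dataclass


@dataclass
class ZkState:
    replicas: list  # assigned replica list as stored in zookeeper
    isr: list       # in-sync replica set as stored in zookeeper


def controller_offline(zk, old):
    # OFFLINE: the controller drops the old replica from the ISR and writes it.
    zk.isr = [r for r in zk.isr if r != old]


def leader_handle_fetch(zk, follower):
    # Assumption: a follower that is still fetching gets added back to the ISR,
    # because it was never told to stop (see KAFKA-1032).
    if follower not in zk.isr:
        zk.isr.append(follower)


def controller_non_existent(zk, old):
    # NON_EXISTENT: nothing is written to zookeeper in this step.
    pass


def controller_write_new_replicas(zk, new_replicas):
    # Only the replica list is updated; the ISR is left untouched.
    zk.replicas = list(new_replicas)


def simulate():
    zk = ZkState(replicas=[1, 2, 3], isr=[1, 2, 3])
    old = 3
    controller_offline(zk, old)
    leader_handle_fetch(zk, old)  # fetch arrives after the OFFLINE change
    controller_non_existent(zk, old)
    controller_write_new_replicas(zk, [1, 2, 4])
    return zk


zk = simulate()
# Replicas that are in the ISR but no longer in the assigned replica list.
stale = [r for r in zk.isr if r not in zk.replicas]
print("replicas:", zk.replicas, "isr:", zk.isr, "stale:", stale)
```

In this model, replica 3 ends up in the ISR even though it is no longer
assigned to the partition, which is the inconsistency reported here.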