[ https://issues.apache.org/jira/browse/KAFKA-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gustafson resolved KAFKA-9089. ------------------------------------ Resolution: Fixed > Reassignment should be resilient to unexpected errors > ----------------------------------------------------- > > Key: KAFKA-9089 > URL: https://issues.apache.org/jira/browse/KAFKA-9089 > Project: Kafka > Issue Type: Bug > Reporter: Jason Gustafson > Assignee: Jason Gustafson > Priority: Major > Fix For: 2.4.0 > > > Reassignment changes typically involve both an update to the assignment state > in zookeeper and an update to the in-memory representation of that state (in > the ControllerContext). We can run into trouble when these states get > inconsistent with each other, so the reassignment logic attempts to follow > some rules to reduce the impact from this: > * When creating a new reassignment, we update the state in zookeeper first > before updating memory. Until the reassignment is known to be persisted, we > do not begin executing any reassignment logic. > * When completing a reassignment, all of the completion steps are executed > before the state is updated in zookeeper. In the event of a failure, the new > controller can retry reassignment completion. > However, the current logic does not follow these rules strictly which can > lead to state inconsistencies in the case of an unexpected error. > # When we override or cancel an existing assignment, we currently use an > intermediate assignment state which is only reflected in memory. It is > basically a mix of the previous assignment state and the overlapping parts of > the new reassignment. The purpose of this is to shutdown unneeded replicas > from the existing reassignment. Since the intermediate state is not > persisted, a controller failure will revert to the old reassignment. Any > exception which does not cause a controller failure will result in state > divergence. > # The target replicas of a reassignment are represented both in the existing > assignment (PartitionReplicaAssignment) and in a separate context object > (ReassignedPartitionContext). The reassignment context is updated before a > reassignment has been accepted and persisted. The intent is to remove this > context object in the event of a submission failure, but an unexpected error > will leave it around. > We can make reassignment more resilient to unexpected errors by using > consistent update invariants. Specifically we can remove the intermediate > assignment state and enforce the invariant that any active reassignment must > be persisted before being reflected in memory. Additionally, we can make the > assignment state the source of truth for the target replicas and eliminate > the possibility of inconsistency. Doing so simplifies the reassignment logic > and makes it more resilient. -- This message was sent by Atlassian Jira (v8.3.4#803005)