[ 
https://issues.apache.org/jira/browse/KAFKA-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-9089.
------------------------------------
    Resolution: Fixed

> Reassignment should be resilient to unexpected errors
> -----------------------------------------------------
>
>                 Key: KAFKA-9089
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9089
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Reassignment changes typically involve both an update to the assignment state 
> in zookeeper and an update to the in-memory representation of that state (in 
> the ControllerContext). We can run into trouble when these states get 
> inconsistent with each other, so the reassignment logic attempts to follow 
> some rules to reduce the impact from this:
>  * When creating a new reassignment, we update the state in zookeeper first 
> before updating memory. Until the reassignment is known to be persisted, we 
> do not begin executing any reassignment logic.
>  * When completing a reassignment, all of the completion steps are executed 
> before the state is updated in zookeeper. In the event of a failure, the new 
> controller can retry reassignment completion.
> However, the current logic does not follow these rules strictly which can 
> lead to state inconsistencies in the case of an unexpected error.
>  # When we override or cancel an existing assignment, we currently use an 
> intermediate assignment state which is only reflected in memory. It is 
> basically a mix of the previous assignment state and the overlapping parts of 
> the new reassignment. The purpose of this is to shutdown unneeded replicas 
> from the existing reassignment. Since the intermediate state is not 
> persisted, a controller failure will revert to the old reassignment. Any 
> exception which does not cause a controller failure will result in state 
> divergence.
>  # The target replicas of a reassignment are represented both in the existing 
> assignment (PartitionReplicaAssignment) and in a separate context object 
> (ReassignedPartitionContext). The reassignment context is updated before a 
> reassignment has been accepted and persisted. The intent is to remove this 
> context object in the event of a submission failure, but an unexpected error 
> will leave it around.
> We can make reassignment more resilient to unexpected errors by using 
> consistent update invariants. Specifically we can remove the intermediate 
> assignment state and enforce the invariant that any active reassignment must 
> be persisted before being reflected in memory. Additionally, we can make the 
> assignment state the source of truth for the target replicas and eliminate 
> the possibility of inconsistency. Doing so simplifies the reassignment logic 
> and makes it more resilient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to