Jason Gustafson created KAFKA-9089:
--------------------------------------
Summary: Reassignment should be resilient to unexpected errors
Key: KAFKA-9089
URL: https://issues.apache.org/jira/browse/KAFKA-9089
Project: Kafka
Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson
Fix For: 2.4.0
Reassignment changes typically involve both an update to the assignment state
in zookeeper and an update to the in-memory representation of that state (in
the ControllerContext). We can run into trouble when these states get
inconsistent with each other, so the reassignment logic attempts to follow some
rules to reduce the impact from this:
* When creating a new reassignment, we update the state in zookeeper first
before updating memory. Until the reassignment is known to be persisted, we do
not begin executing any reassignment logic.
* When completing a reassignment, all of the completion steps are executed
before the state is updated in zookeeper. In the event of a failure, the new
controller can retry reassignment completion.
However, the current logic does not follow these rules strictly which can lead
to state inconsistencies in the case of an unexpected error.
# When we override or cancel an existing assignment, we currently use an
intermediate assignment state which is only reflected in memory. It is
basically a mix of the previous assignment state and the overlapping parts of
the new reassignment. The purpose of this is to shutdown unneeded replicas from
the existing reassignment. Since the intermediate state is not persisted, a
controller failure will revert to the old reassignment. Any exception which
does not cause a controller failure will result in state divergence.
# The target replicas of a reassignment are represented both in the existing
assignment (PartitionReplicaAssignment) and in a separate context object
(ReassignedPartitionContext). The reassignment context is updated before a
reassignment has been accepted and persisted. The intent is to remove this
context object in the event of a submission failure, but an unexpected error
will leave it around.
We can make reassignment more resilient to unexpected errors by using
consistent update invariants. Specifically we can remove the intermediate
assignment state and enforce the invariant that any active reassignment must be
persisted before being reflected in memory. Additionally, we can make the
assignment state the source of truth for the target replicas and eliminate the
possibility of inconsistency. Doing so simplifies the reassignment logic and
makes it more resilient.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)