Dhruvil Shah created KAFKA-9961:
-----------------------------------
Summary: Brokers may be left in an inconsistent state after
reassignment
Key: KAFKA-9961
URL: https://issues.apache.org/jira/browse/KAFKA-9961
Project: Kafka
Issue Type: Bug
Reporter: Dhruvil Shah
When completing a reassignment, the controller sends StopReplicaRequest to
replicas that are not in the target assignment and removes them from the
assignment in ZK. We do not have any retry mechanism to ensure that the broker
is able to process the StopReplicaRequest successfully. Under certain
circumstances, this could leave brokers in an inconsistent state, where they
continue being the follower for this partition and end up with an inconsistent
metadata cache.
We have seen messages like the following being spammed in the broker logs when
we get into this situation:
{code:java}
While recording the replica LEO, the partition topic-1 hasn't been created.
{code}
This happens because the broker has received neither an updated
LeaderAndIsrRequest for the new leader nor a StopReplicaRequest from the
controller when the replica was removed from the assignment.
Note that we would require a restart of the affected broker to fix this
situation. A controller failover would not fix it as the broker could continue
being a replica for the partition until it receives a StopReplicaRequest, which
would never happen in this case.
There seem to be a couple of problems we should address:
# We need a mechanism to retry replica deletions after partition reassignment
is complete. The main challenge here is to be able to deal with cases where a
broker has been decommissioned and may never come back up.
# We could perhaps consider a mechanism to reconcile replica states across
brokers, something similar to the solution proposed in
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-550%3A+Mechanism+to+Delete+Stray+Partitions+on+Broker].
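To make the first point more concrete, here is a rough sketch of how pending replica deletions could be tracked and retried. This is not existing controller code: the class name PendingReplicaDeletions, its methods, and the sendStopReplica callback are all hypothetical, and the sketch assumes that some explicit signal (for example an operator action) tells the controller that a broker has been decommissioned for good.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

/** Hypothetical sketch only; none of these names exist in Kafka. */
public class PendingReplicaDeletions {
    // brokerId -> partitions whose replica deletion has not been acknowledged yet
    private final Map<Integer, Set<String>> pending = new HashMap<>();

    /** Record that a replica on this broker still has to be stopped and deleted. */
    public void add(int brokerId, String topicPartition) {
        pending.computeIfAbsent(brokerId, id -> new HashSet<>()).add(topicPartition);
    }

    /** Assumed signal: the broker is decommissioned and will never come back. */
    public void forgetBroker(int brokerId) {
        pending.remove(brokerId);
    }

    /**
     * One retry pass. sendStopReplica stands in for sending a StopReplicaRequest
     * and returns true only if the broker acknowledged it. Offline brokers are
     * skipped, but their entries are kept so a later pass can retry them.
     */
    public void retryOnce(Set<Integer> liveBrokers,
                          BiPredicate<Integer, String> sendStopReplica) {
        Iterator<Map.Entry<Integer, Set<String>>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Set<String>> entry = it.next();
            int brokerId = entry.getKey();
            if (!liveBrokers.contains(brokerId)) {
                continue; // keep pending until the broker returns or is forgotten
            }
            // Drop only the partitions the broker acknowledged.
            entry.getValue().removeIf(tp -> sendStopReplica.test(brokerId, tp));
            if (entry.getValue().isEmpty()) {
                it.remove();
            }
        }
    }

    /** Snapshot of what is still outstanding for a broker. */
    public Set<String> pendingFor(int brokerId) {
        return new HashSet<>(pending.getOrDefault(brokerId, Collections.<String>emptySet()));
    }
}
```

In this sketch, retryOnce would be driven periodically or on broker registration events, and the pending set would have to be persisted somewhere that survives a controller failover. Entries for a broker that never returns stay pending until forgetBroker is called, which is the part that corresponds to the decommissioning challenge mentioned above.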
--
This message was sent by Atlassian Jira
(v8.3.4#803005)