[
https://issues.apache.org/jira/browse/KAFKA-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dong Lin resolved KAFKA-6618.
-----------------------------
Resolution: Duplicate
Duplicate of https://issues.apache.org/jira/browse/KAFKA-6082
> Prevent two controllers from updating znodes concurrently
> ---------------------------------------------------------
>
> Key: KAFKA-6618
> URL: https://issues.apache.org/jira/browse/KAFKA-6618
> Project: Kafka
> Issue Type: Bug
> Reporter: Dong Lin
> Assignee: Dong Lin
> Priority: Major
>
> Kafka controller may fail to function properly (even after repeated
> controller movement) due to the following sequence of events:
> - User requests topic deletion
> - Controller A deletes the partition znode
> - Controller B becomes controller and reads the topic znode
> - Controller A deletes the topic znode and remove the topic from the topic
> deletion znode
> - Controller B reads the partition znode and topic deletion znode
> - According to controller B's context, the topic znode exists, the topic is
> not listed for deletion, and some partition is not found for the given topic.
> Then controller B will create topic znode with empty data (i.e. partition
> assignment) and create the partition znodes.
> - All controller after controller B will fail because there is not data in
> the topic znode.
> The long term solution is to have a way to prevent old controller from
> writing to zookeeper if it is not the active controller. One idea is to use
> the zookeeper multi API (See
> [https://zookeeper.apache.org/doc/r3.4.3/api/org/apache/zookeeper/ZooKeeper.html#multi(java.lang.Iterable))]
> such that controller only writes to zookeeper if the zk version of the
> controller znode has not been changed.
> The short term solution is to let controller reads the topic deletion znode
> first. If the topic is still listed in the topic deletion znode, then the new
> controller will properly handle partition states of this topic without
> creating partition znodes for this topic. And if the topic is not listed in
> the topic deletion znode, then both the topic znode and the partition znodes
> of this topic should have been deleted by the time the new controller tries
> to read them.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)