[ https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640384#comment-16640384 ]
Adam Elliott commented on KAFKA-5200:
-------------------------------------

My team runs a multi-tenant Kafka cluster with a lot of diverse uses, and one of the services we provide is an API for managed topic creation/deletion. The cluster is large (> 100 nodes), so it's pretty likely that, for whatever reason, at least one node will be down at any given point, sometimes for extended periods.

We're currently struggling with the behaviour described above. From what I can see in the source, this is intentional behaviour. We don't have control over when clients choose to delete topics, so we can't reasonably block deletions for reasons that they would see as arbitrary ("some backend server is down, try again later").

The open source partition reassignment tool _does_ work, as of the version we're using at least, to move replicas off of dead brokers, but only if the topic hasn't already been deleted. If it has, the only remedy is manual surgery on Zookeeper state and bouncing the controller.

There's one additional factor which makes this bug worse: if too many topics are "half-deleted" at once, the controller crashes or becomes unresponsive, at which point a minor annoyance for one of our customers becomes something much more serious.

I've had a look at the various deletion-related state machines and I don't see an easy fix. I also haven't seen much mention or discussion of this problem apart from this issue.

> If a replicated topic is deleted with one broker down, it can't be recreated
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-5200
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5200
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Edoardo Comar
>            Priority: Major
>
> In a cluster with 5 brokers, replication factor=3, min in sync=2,
> one broker went down
> A user's app remained of course unaware of that and deleted a topic that
> (unknowingly) had a replica on the dead broker.
> The topic went into 'pending delete' mode
> The user then tried to recreate the topic - which failed, so his app was left
> stuck - no working topic and no ability to create one.
> The reassignment tool fails to move the replica out of the dead broker -
> specifically because the broker with the partition replica to move is dead :-)
> Incidentally the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the open source reassignment tool

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)