[ https://issues.apache.org/jira/browse/SPARK-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069692#comment-14069692 ]
Saisai Shao commented on SPARK-2383:
------------------------------------

Hi Tobias, I've also noticed this problem. Spark's behavior for "auto.offset.reset" seems to differ from Kafka's original purpose, so I have asked TD about the original design intent of this setting. Here is the link: https://issues.apache.org/jira/browse/SPARK-2492

> With auto.offset.reset, KafkaReceiver potentially deletes Consumer nodes from
> Zookeeper
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-2383
>                 URL: https://issues.apache.org/jira/browse/SPARK-2383
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>            Reporter: Tobias Pfeiffer
>
> When auto.offset.reset is set in the Kafka configuration, {{KafkaReceiver}}'s
> {{tryZookeeperConsumerGroupCleanup()}} deletes the whole /consumers/<groupId>
> tree in Zookeeper before creating consumer nodes. If consumer nodes are already
> present (which can happen when multiple KafkaReceivers in the same consumer
> group are launched), they are deleted as well, leading to subsequent NoNode
> exceptions, for example on rebalance.
> There should be a check before the delete, such as
> {{if (zk.countChildren(dir + "/ids") == 0) ...}} (ideally performed atomically),
> to prevent deleting existing consumer nodes.
> (Also note that the behavior of auto.offset.reset as implemented by Spark's
> Kafka receiver differs from the behavior described in Kafka's documentation.)
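Below is a minimal sketch of the guard suggested in the description, written against the org.I0Itec.zkclient.ZkClient API that Kafka's high-level consumer uses. It is not the actual Spark code: the helper name cleanupConsumerGroupIfEmpty and the way zk and groupId are passed in are assumptions for illustration only; the real logic lives in KafkaReceiver's tryZookeeperConsumerGroupCleanup().

{code:scala}
import org.I0Itec.zkclient.ZkClient

// Hypothetical helper (not Spark's actual method) showing the proposed check:
// only remove the consumer-group directory when no consumer has registered
// an ephemeral node under <dir>/ids.
def cleanupConsumerGroupIfEmpty(zk: ZkClient, groupId: String): Unit = {
  val dir = "/consumers/" + groupId
  if (zk.countChildren(dir + "/ids") == 0) {
    // Check-then-delete is not atomic: a consumer registering between these two
    // calls could still lose its node, which is why the report asks for an
    // atomic variant of this guard.
    zk.deleteRecursive(dir)
  }
}
{code}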