[
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711838#comment-14711838
]
Guozhang Wang commented on KAFKA-1387:
--------------------------------------
Thanks for the patch, [~fpj]. Here are some high-level comments:
1. Will mixing direct use of ZK with ZkClient violate ordering? AFAIK ZkClient
orders all events fired by watchers and hands them to the user callbacks one by
one. If we use ZK's Watcher directly, will its callback be called out of order
with respect to other events?
2. If we get a Code.OK in CreateCallback, do we still need to trigger
ZooKeeper.exists with ExistsCallback again?
3. For the consumer / server registration case in particular, we try to handle
parent path creation in ZkUtils.makeSurePersistentPathExists, so I feel we
should expose the problem that the parent path does not exist yet instead of
trying to hide it in createRecursive.
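
To make points 2 and 3 concrete, here is a rough sketch of the decision table
I have in mind for the result of an ephemeral create. It is plain Python with
no ZooKeeper dependency. The result-code names mirror KeeperException.Code,
but Action, on_create_result and on_exists_result are hypothetical names used
only for illustration; this is not what the patch does.

```python
from enum import Enum


class Code(Enum):
    # Names mirror ZooKeeper's KeeperException.Code; the values are arbitrary.
    OK = "OK"
    NODEEXISTS = "NODEEXISTS"
    NONODE = "NONODE"
    CONNECTIONLOSS = "CONNECTIONLOSS"


class Action(Enum):
    # Hypothetical next steps, named for this sketch only.
    DONE = "done"
    CHECK_OWNER = "call exists and compare the ephemeral owner"
    FAIL_MISSING_PARENT = "fail: parent path does not exist"
    RETRY_CREATE = "retry create"


def on_create_result(code):
    """Decide the next step after an ephemeral create returns `code`."""
    if code is Code.OK:
        # Point 2: the create succeeded in this session, so no exists call.
        return Action.DONE
    if code is Code.NODEEXISTS:
        # Only here do we need to find out which session owns the node.
        return Action.CHECK_OWNER
    if code is Code.NONODE:
        # Point 3: surface the missing parent instead of creating it silently.
        return Action.FAIL_MISSING_PARENT
    # For example CONNECTIONLOSS: the outcome is unknown, so try again.
    return Action.RETRY_CREATE


def on_exists_result(ephemeral_owner, session_id):
    """After NODEEXISTS: the node counts as ours only if our session owns it."""
    if ephemeral_owner == session_id:
        return Action.DONE
    # Owned by another (possibly expiring) session: try the create again later.
    return Action.RETRY_CREATE
```

With this shape, exists is only issued on the NODEEXISTS branch, and a missing
parent shows up as an explicit failure at the call site.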
> Kafka getting stuck creating ephemeral node it has already created when two
> zookeeper sessions are established in a very short period of time
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-1387
> URL: https://issues.apache.org/jira/browse/KAFKA-1387
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.1.1
> Reporter: Fedor Korotkiy
> Assignee: Flavio Junqueira
> Priority: Blocker
> Labels: newbie, patch, zkclient-problems
> Attachments: KAFKA-1387.patch, kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession()
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>
> Now imagine the following sequence of events.
> 1) Zookeeper session reestablishes. handleNewSession() callback is queued by
> the zkClient, but not invoked yet.
> 2) Zookeeper session reestablishes again, queueing the callback a second time.
> 3) First callback is invoked, creating /broker/[id] ephemeral path.
> 4) Second callback is invoked and it tries to create /broker/[id] path using
> createEphemeralPathExpectConflictHandleZKBug() function. But the path
> already exists, so createEphemeralPathExpectConflictHandleZKBug() gets
> stuck in an infinite loop.
> Seems like the controller election code has the same issue.
> I am able to reproduce this issue on the 0.8.1 branch from github using the
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using
> Ctrl-Z.
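
The sequence of events in the description above can be reproduced in a small
in-memory simulation. This is only a sketch: FakeZk is a stand-in, not the
ZkClient or ZooKeeper API, register_with_blind_retry is my reading of the
retry-on-conflict behaviour described in the report (bounded here so that it
terminates), and register_owner_aware is one hypothetical guard, not
necessarily the one in the attached patch.

```python
class FakeZk:
    """In-memory stand-in for a ZooKeeper session; not the real client API."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.nodes = {}  # path -> (data, owning session id)

    def create_ephemeral(self, path, data):
        if path in self.nodes:
            raise FileExistsError(path)
        self.nodes[path] = (data, self.session_id)


def register_with_blind_retry(zk, path, data, max_retries=5):
    """Retry while a node with our data exists, assuming it is stale.

    The behaviour described in the report has no retry bound; max_retries
    exists only so this simulation ends. Returns True once registered and
    False if every attempt hit the existing node.
    """
    for _ in range(max_retries):
        try:
            zk.create_ephemeral(path, data)
            return True
        except FileExistsError:
            stored, _owner = zk.nodes[path]
            if stored != data:
                raise  # a genuinely different registrant holds the path
            # Same data: assume an old session's node will vanish, try again.
    return False


def register_owner_aware(zk, path, data):
    """Hypothetical guard: treat a node owned by our own session as success."""
    try:
        zk.create_ephemeral(path, data)
        return True
    except FileExistsError:
        stored, owner = zk.nodes[path]
        return stored == data and owner == zk.session_id


# Two handleNewSession() callbacks queued back to back on the same session.
zk = FakeZk(session_id=42)
results = [register_with_blind_retry(zk, "/broker/1", "broker-1") for _ in range(2)]
# results == [True, False]: the second callback never gets past its own node.
second_try = register_owner_aware(zk, "/broker/1", "broker-1")
# second_try is True: the node already belongs to session 42.
```

The second callback spins because the node it is waiting on belongs to the
live session and will never be deleted by an expiry.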