[ https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721137#comment-14721137 ]
ASF GitHub Bot commented on KAFKA-1387:
---------------------------------------

GitHub user fpj opened a pull request:

    https://github.com/apache/kafka/pull/178

    KAFKA-1387: Kafka getting stuck creating ephemeral node it has already created when two zookeeper sessions are established in a very short period of time

    This is a patch to get around the problem discussed in the KAFKA-1387 jira. The tests are not passing on my box when I run them all, but they do pass when I run them individually, which indicates that something is leaking from one test to the next. I still need to work this out and also test the patch further. I wanted to open this PR now so that it can start getting reviewed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fpj/kafka 1387

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #178

----
commit f8be8657e649d0490e9ed1f1ef52234b3c31435e
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-23T13:55:11Z

    KAFKA-1387: First cut, node dependency on curator

commit b8f901b6478d4ac9c961e899d702e6fc11cfee07
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-23T13:55:11Z

    KAFKA-1387: First cut, node dependency on curator

commit 2369e66921f88b2ee1b24ddeff2bf2d050015447
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-23T14:07:41Z

    Merge branch '1387' of https://github.com/fpj/kafka into 1387

commit f03c301d5d919d9c05c6837de508b4f383906fdb
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-23T13:55:11Z

    KAFKA-1387: First cut, node dependency on curator

commit d8eab9e0f569eaaecb4afda4d486d00600ad1e6f
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T14:56:01Z

    KAFKA-1387: Some polishing

commit b7cbe5dbecbc28a564b99209114f39db785c73dd
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T15:50:58Z

    KAFKA-1387: Style fixes

commit 336f67c641c44b73ac1dbb66cdde4ff97f2fcd9a
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T15:53:18Z

    KAFKA-1387: More style fixes

commit 201ab2dcc33ba10a19c51f7452ce40497d3fcf83
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T15:59:32Z

    Merge branch '1387' of https://github.com/fpj/kafka into 1387

commit 9961665230e04331f7767d8aa8aaac0a14f46cd8
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-23T13:55:11Z

    KAFKA-1387: First cut, node dependency on curator

commit b52c12422f7a831137d8659f14779eaad1972217
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T14:56:01Z

    KAFKA-1387: Some polishing

commit b2400a0a37555250d50b1f1abfdda2c4d00b03ac
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T15:50:58Z

    KAFKA-1387: Style fixes

commit 888f6e0cf17d6a3a8d6b8dd46f8099731ba36511
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-24T15:53:18Z

    KAFKA-1387: More style fixes

commit d675b024b0e8627c4c2c9c113c07527851e81f7a
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-29T15:00:07Z

    KAFKA-1387

commit 4c83ac2609ed29a0f1887bf5087dab50e3e93488
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-29T15:07:23Z

    KAFKA-1387: Removing whitespaces.

commit 240b51a77715c53db784d5932702318ff28468c2
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-29T15:11:30Z

    Merge branch '1387' of https://github.com/fpj/kafka into 1387

----

> Kafka getting stuck creating ephemeral node it has already created when two
> zookeeper sessions are established in a very short period of time
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-1387
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1387
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Fedor Korotkiy
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>              Labels: newbie, patch, zkclient-problems
>         Attachments: KAFKA-1387.patch, kafka-1387.patch
>
>
> The Kafka broker re-registers itself in zookeeper every time the
> handleNewSession() callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>
> Now imagine the following sequence of events.
> 1) The zookeeper session is reestablished. The handleNewSession() callback
> is queued by the zkClient, but not invoked yet.
> 2) The zookeeper session is reestablished again, queueing the callback a
> second time.
> 3) The first callback is invoked, creating the /broker/[id] ephemeral path.
> 4) The second callback is invoked and tries to create the /broker/[id] path
> using the createEphemeralPathExpectConflictHandleZKBug() function. But the
> path already exists, so createEphemeralPathExpectConflictHandleZKBug() gets
> stuck in an infinite loop.
> It seems the controller election code has the same issue.
> I am able to reproduce this issue on the 0.8.1 branch from github using the
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using
> Ctrl-Z.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
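
The sequence of events in the issue description (steps 1 to 4) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `ToyZk`, `register_with_retry`, `register_checking_owner`, and `run` are hypothetical names, not Kafka, zkClient, or ZooKeeper APIs, and the retry loop is capped so the demo terminates where the reported loop would not. The owner check in the second function is one possible guard, shown for contrast; it is not claimed to be what the pull request does.

```python
from collections import deque


class NodeExists(Exception):
    """Raised when create_ephemeral() targets a path that is already present."""


class ToyZk:
    """Hypothetical in-memory stand-in for a ZooKeeper server with one client."""

    def __init__(self):
        self.nodes = {}      # path -> (data, owner_session_id)
        self.session_id = 0  # id of the client's current session

    def new_session(self):
        # Expiring the old session removes the ephemeral nodes it owned.
        old = self.session_id
        self.nodes = {p: v for p, v in self.nodes.items() if v[1] != old}
        self.session_id += 1

    def create_ephemeral(self, path, data):
        if path in self.nodes:
            raise NodeExists(path)
        self.nodes[path] = (data, self.session_id)


def register_with_retry(zk, path, data, max_retries=5):
    """Retry on conflict, assuming a node holding our own data is a leftover
    from an expired session that will eventually be deleted.
    max_retries is an assumption added so this demo terminates."""
    for _ in range(max_retries):
        try:
            zk.create_ephemeral(path, data)
            return "created"
        except NodeExists:
            stored, _owner = zk.nodes[path]
            if stored != data:
                raise  # someone else owns the path: a genuine conflict
            # An unbounded version would back off here and try again.
    return "stuck"


def register_checking_owner(zk, path, data):
    """Hypothetical guard: a node created by the *current* session with our
    data means we are already registered, so there is nothing to wait for."""
    try:
        zk.create_ephemeral(path, data)
        return "created"
    except NodeExists:
        stored, owner = zk.nodes[path]
        if stored == data and owner == zk.session_id:
            return "already-registered"
        raise


def run(register):
    zk = ToyZk()
    callbacks = deque()
    path, data = "/broker/1", "broker-1"
    # Steps 1 and 2: two sessions are established back to back; each queues
    # a new-session callback, but neither callback has run yet.
    zk.new_session()
    callbacks.append(lambda: register(zk, path, data))
    zk.new_session()
    callbacks.append(lambda: register(zk, path, data))
    # Steps 3 and 4: the queued callbacks run in order, both under the
    # second (current) session.
    return [cb() for cb in callbacks]


if __name__ == "__main__":
    print(run(register_with_retry))      # ['created', 'stuck']
    print(run(register_checking_owner))  # ['created', 'already-registered']
```

In this model the second callback can never succeed by waiting, because the node it collides with belongs to the live session and will not expire on its own.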