[ https://issues.apache.org/jira/browse/KAFKA-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-1907:
---------------------------------
    Labels: zkclient-problems  (was: )

> ZkClient can block controlled shutdown indefinitely
> ---------------------------------------------------
>
>                 Key: KAFKA-1907
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1907
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, zkclient
>    Affects Versions: 0.8.2.0
>            Reporter: Ewen Cheslack-Postava
>            Assignee: jaikiran pai
>              Labels: zkclient-problems
>         Attachments: KAFKA-1907.patch
>
>
> There are some calls to ZkClient via ZkUtils in
> KafkaServer.controlledShutdown() that can block indefinitely because they
> internally call waitUntilConnected. The ZkClient API doesn't provide an
> alternative with timeouts, so fixing this will require enforcing timeouts in
> some other way.
> This may be a more general issue if there are any non daemon threads that
> also call ZkUtils methods.
> Stacktrace showing the issue:
> {code}
> "Thread-2" prio=10 tid=0xb3305000 nid=0x4758 waiting on condition [0x6ad69000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x70a93368> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:267)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUntil(AbstractQueuedSynchronizer.java:2130)
>         at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:636)
>         at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:619)
>         at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:615)
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:679)
>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
>         at kafka.utils.ZkUtils$.readDataMaybeNull(ZkUtils.scala:456)
>         at kafka.utils.ZkUtils$.getController(ZkUtils.scala:65)
>         at kafka.server.KafkaServer.kafka$server$KafkaServer$$controlledShutdown(KafkaServer.scala:194)
>         at kafka.server.KafkaServer$$anonfun$shutdown$1.apply$mcV$sp(KafkaServer.scala:269)
>         at kafka.utils.Utils$.swallow(Utils.scala:172)
>         at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
>         at kafka.utils.Utils$.swallowWarn(Utils.scala:45)
>         at kafka.utils.Logging$class.swallow(Logging.scala:94)
>         at kafka.utils.Utils$.swallow(Utils.scala:45)
>         at kafka.server.KafkaServer.shutdown(KafkaServer.scala:269)
>         at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:42)
>         at kafka.Kafka$$anon$1.run(Kafka.scala:42)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
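
The description notes that the ZkClient API offers no timeout variants, so a timeout has to be enforced from outside the call. One possible way to do that, shown here only as a rough sketch, is to run the blocking call on a daemon worker thread and bound the wait with Future.get(timeout). This is not necessarily what the attached KAFKA-1907.patch does. The names TimedCall and callWithTimeout, and the sleeping Callable that stands in for a ZkUtils call such as getController, are hypothetical and exist only for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedCall {
    // Daemon workers: a call that never returns cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-call-worker");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs blockingCall on a worker thread and waits at most timeoutMs for it.
     * Returns the call's result, or fallback if the timeout expires first.
     */
    public static <T> T callWithTimeout(Callable<T> blockingCall, long timeoutMs, T fallback)
            throws ExecutionException, InterruptedException {
        Future<T> future = POOL.submit(blockingCall);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort: interrupt the worker. Whether the underlying call
            // actually stops depends on how it handles interruption.
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a ZooKeeper read that blocks while disconnected.
        Callable<Integer> stuck = () -> {
            Thread.sleep(60_000);
            return 1;
        };
        // Give up after 200 ms and fall back to -1 ("controller unknown").
        Integer controllerId = callWithTimeout(stuck, 200, -1);
        System.out.println(controllerId); // prints -1
    }
}
```

With this shape the shutdown thread itself never parks inside waitUntilConnected; at worst a daemon worker stays blocked, and that does not prevent the JVM from exiting. The trade-off is that the caller must decide what a timed-out read means, for example skipping controlled shutdown and proceeding with a normal shutdown.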