[
https://issues.apache.org/jira/browse/KAFKA-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664535#comment-15664535
]
ASF GitHub Bot commented on KAFKA-4409:
---------------------------------------
GitHub user jjkoshy opened a pull request:
https://github.com/apache/kafka/pull/2129
KAFKA-4409; Fix deadlock between topic event handling and shutdown in…
The consumer can deadlock on shutdown if a topic event fires during
shutdown. The shutdown acquires the rebalance lock and then the
topic-event-watcher lock. The topic event watcher acquires these in the reverse
order. Shutdown should not need to acquire the topic-event-watcher’s lock: all
it does is unsubscribe from topic events.
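
The lock ordering described above can be sketched in isolation. The following
is a minimal illustration, not the code from this pull request: the class,
field, and method names (LockOrderSketch, rebalanceLock, watcherLock,
handleTopicEvent, shutdown) are hypothetical stand-ins for the two monitors in
the thread dump, and the latches exist only to force the interleaving that
deadlocked. It assumes the unsubscribe step can be done without the watcher
lock, here modelled as flipping an atomic flag.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class LockOrderSketch {
    // Hypothetical stand-ins for the two monitors in the thread dump.
    private final Object rebalanceLock = new Object();
    private final Object watcherLock = new Object();
    private final AtomicBoolean subscribed = new AtomicBoolean(true);

    // Latches used only to force the problematic interleaving.
    private final CountDownLatch watcherLockHeld = new CountDownLatch(1);
    private final CountDownLatch rebalanceLockHeld = new CountDownLatch(1);

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Event path: takes the watcher lock first, then the rebalance lock.
    void handleTopicEvent() {
        synchronized (watcherLock) {
            watcherLockHeld.countDown();
            awaitQuietly(rebalanceLockHeld); // shutdown now holds rebalanceLock
            synchronized (rebalanceLock) {
                if (subscribed.get()) {
                    // a rebalance would run here
                }
            }
        }
    }

    // Shutdown path after the fix: takes the rebalance lock only.
    // The old ordering would also enter synchronized (watcherLock) here,
    // which is the reverse of the event path and can deadlock.
    void shutdown() {
        synchronized (rebalanceLock) {
            rebalanceLockHeld.countDown();
            awaitQuietly(watcherLockHeld);   // event thread holds watcherLock
            subscribed.set(false);           // unsubscribe without watcherLock
        }
    }

    // Runs both paths concurrently; true if both threads finish in time.
    static boolean runScenario() {
        LockOrderSketch s = new LockOrderSketch();
        Thread event = new Thread(s::handleTopicEvent, "event-thread");
        Thread stop = new Thread(s::shutdown, "shutdown-thread");
        event.setDaemon(true);
        stop.setDaemon(true);
        event.start();
        stop.start();
        try {
            event.join(5000);
            stop.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !event.isAlive() && !stop.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + runScenario());
    }
}
```

Because shutdown never waits on watcherLock in this sketch, it can release
rebalanceLock even while the event thread holds watcherLock, so the event
thread eventually proceeds and the cycle cannot form.
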
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jjkoshy/kafka KAFKA-4409
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/2129.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2129
----
commit 2a5a276822452b03d816c3ae2880a7b7bf15ea46
Author: Joel Koshy <[email protected]>
Date: 2016-11-14T17:48:16Z
KAFKA-4409; Fix deadlock between topic event handling and shutdown in the
old consumer.
----
> ZK consumer shutdown/topic event deadlock
> -----------------------------------------
>
> Key: KAFKA-4409
> URL: https://issues.apache.org/jira/browse/KAFKA-4409
> Project: Kafka
> Issue Type: Bug
> Reporter: Joel Koshy
>
> This only applies to the old zookeeper consumer. It is trivial enough to fix.
> The consumer can deadlock on shutdown if a topic event fires during shutdown.
> The shutdown acquires the rebalance lock and then the topic-event-watcher
> lock. The topic event watcher acquires these in the reverse order. Shutdown
> should not need to acquire the topic-event-watcher’s lock: all it does is
> unsubscribe from topic events.
> Stack trace:
> {noformat}
> "mirrormaker-thread-0":
>   at kafka.consumer.ZookeeperTopicEventWatcher.shutdown(ZookeeperTopicEventWatcher.scala:50)
>   - waiting to lock <0x000000072a65d508> (a java.lang.Object)
>   at kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:216)
>   - locked <0x00000007103c69c0> (a java.lang.Object)
>   at kafka.tools.MirrorMaker$MirrorMakerOldConsumer.cleanup(MirrorMaker.scala:519)
>   at kafka.tools.MirrorMaker$MirrorMakerThread$$anonfun$run$3.apply$mcV$sp(MirrorMaker.scala:441)
>   at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:76)
>   at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
>   at kafka.utils.CoreUtils$.swallowWarn(CoreUtils.scala:47)
>   at kafka.utils.Logging$class.swallow(Logging.scala:94)
>   at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:47)
>   at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:441)
> "ZkClient-EventThread-58-<zkconnect>":
>   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:639)
>   - waiting to lock <0x00000007103c69c0> (a java.lang.Object)
>   at kafka.consumer.ZookeeperConsumerConnector.kafka$consumer$ZookeeperConsumerConnector$$reinitializeConsumer(ZookeeperConsumerConnector.scala:982)
>   at kafka.consumer.ZookeeperConsumerConnector$WildcardStreamsHandler.handleTopicEvent(ZookeeperConsumerConnector.scala:1048)
>   at kafka.consumer.ZookeeperTopicEventWatcher$ZkTopicEventListener.liftedTree1$1(ZookeeperTopicEventWatcher.scala:69)
>   at kafka.consumer.ZookeeperTopicEventWatcher$ZkTopicEventListener.handleChildChange(ZookeeperTopicEventWatcher.scala:65)
>   - locked <0x000000072a65d508> (a java.lang.Object)
>   at org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)
>   at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> Found one Java-level deadlock:
> =============================
> "mirrormaker-thread-0":
>   waiting to lock monitor 0x00007f1f38029748 (object 0x000000072a65d508, a java.lang.Object),
>   which is held by "ZkClient-EventThread-58-<zkconnect>"
> "ZkClient-EventThread-58-<zkconnect>":
>   waiting to lock monitor 0x00007f1e900249a8 (object 0x00000007103c69c0, a java.lang.Object),
>   which is held by "mirrormaker-thread-0"
> {noformat}