[ https://issues.apache.org/jira/browse/KAFKA-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Hargett updated KAFKA-989:
-------------------------------

    Status: Open  (was: Patch Available)

The patch is not good enough: it deadlocks because ShutdownableThread.shutdown
grabs another lock.
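
As a rough illustration of this kind of deadlock (not the actual Kafka code),
the Java sketch below uses a hypothetical Worker thread and a hypothetical
shared lock. Which lock ShutdownableThread.shutdown really contends on is not
stated here, so every name in the sketch is an assumption. It shows that
waiting for a worker to exit while holding a lock the worker needs cannot
complete, whereas releasing the lock first lets the shutdown finish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ShutdownSketch {

    /** Hypothetical background worker; a stand-in, not Kafka's ShutdownableThread. */
    static final class Worker extends Thread {
        private final ReentrantLock sharedLock;
        private final CountDownLatch readyToLock = new CountDownLatch(1);
        private final CountDownLatch finished = new CountDownLatch(1);
        private volatile boolean running = true;

        Worker(ReentrantLock sharedLock) {
            this.sharedLock = sharedLock;
            setDaemon(true); // never keeps the JVM alive in this demo
        }

        @Override
        public void run() {
            try {
                while (running) {
                    readyToLock.countDown();  // committed to taking the lock next
                    sharedLock.lock();        // lock() does not respond to interrupts
                    try {
                        // one unit of work would go here
                    } finally {
                        sharedLock.unlock();
                    }
                    Thread.sleep(1);          // throws if shutdown() interrupted us
                }
            } catch (InterruptedException e) {
                // interrupted by shutdown(): fall through and exit
            } finally {
                finished.countDown();
            }
        }

        void awaitReadyToLock() throws InterruptedException {
            readyToLock.await();
        }

        /** Signals the worker to stop; returns true if it exited within the timeout. */
        boolean shutdown(long timeoutMs) throws InterruptedException {
            running = false;
            interrupt();
            return finished.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    /** Hazard: wait for the worker while still holding the lock it is blocked on. */
    static boolean shutdownWhileHoldingLock() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Worker worker = new Worker(lock);
        lock.lock();
        try {
            worker.start();
            worker.awaitReadyToLock();
            // The worker is stuck in lock(); without the timeout this would hang forever.
            return worker.shutdown(200);
        } finally {
            lock.unlock(); // only now can the worker proceed and exit
        }
    }

    /** Safer ordering: release the lock first, then wait for the worker to exit. */
    static boolean shutdownAfterReleasingLock() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Worker worker = new Worker(lock);
        lock.lock();
        try {
            worker.start();
            worker.awaitReadyToLock();
        } finally {
            lock.unlock();
        }
        return worker.shutdown(2000);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shutdown while holding lock completed: "
                + shutdownWhileHoldingLock());
        System.out.println("shutdown after releasing lock completed: "
                + shutdownAfterReleasingLock());
    }
}
```

The timeout in shutdown exists only so the demo terminates; a real fix would
have to avoid holding the contended lock across the wait, or avoid the second
lock altogether.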
                
> Race condition shutting down high-level consumer results in spinning 
> background thread
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-989
>                 URL: https://issues.apache.org/jira/browse/KAFKA-989
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>         Environment: Ubuntu Linux x64
>            Reporter: Phil Hargett
>         Attachments: KAFKA-989-failed-to-find-leader.patch
>
>
> An application that uses the Kafka client under load can often hit this 
> issue within a few hours.
> High-level consumers come and go over this application's lifecycle, but there 
> are a variety of defenses that ensure each high-level consumer lasts several 
> seconds before being shut down.  Nevertheless, some race is causing this 
> background thread to continue long after the ZkClient it is using has been 
> disconnected.  Since the thread was spawned by a consumer that has already 
> been shut down, the application has no way to find this thread and stop it.
> Reported on the users-kafka mailing list 6/25 as "0.8 throwing exception 
> 'Failed to find leader' and high-level consumer fails to make progress". 
> The only remedy is to shut down the application and restart it.  Externally 
> detecting that this state has occurred is not pleasant: one has to grep the 
> log for repeated occurrences of the same exception.
> Stack trace:
> Failed to find leader for Set([topic6,0]): java.lang.NullPointerException
>       at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
>       at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
>       at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>       at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
>       at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
>       at kafka.utils.ZkUtils$.getChildrenParentMayNotExist(ZkUtils.scala:438)
>       at kafka.utils.ZkUtils$.getAllBrokersInCluster(ZkUtils.scala:75)
>       at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:63)
>       at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
