[ 
https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360306#comment-15360306
 ] 

Peter Davis commented on KAFKA-3893:
------------------------------------

Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka 
-- when a zookeeper connection is lost, any other changes in the cluster during 
the loss are not recognized when it reconnects.  We see the same loop of 
"Shrinking ISR" and "Cached zkVersion [###] not equal to that in zookeeper", 
and the broker never recovers. 

For us this happened almost daily when running on a cluster of virtual machines 
that would get paused for a few seconds every night for a snapshot backup.  We 
disabled the backup, but it's very concerning that Kafka won't recover after a 
pause!
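
One way to notice this condition from outside the broker is to compare the 
broker IDs you expect against the children currently registered under 
/brokers/ids. The following is only a minimal sketch: the function name and the 
sample IDs are hypothetical, and it assumes the child names have already been 
fetched with some ZooKeeper client (for example kazoo's 
KazooClient.get_children("/brokers/ids"), which returns them as strings).

```python
def missing_broker_ids(expected_ids, registered_children):
    """Return the expected broker IDs that have no node under /brokers/ids.

    expected_ids: iterable of ints, the broker.id values you deployed
                  (hypothetical input, not taken from the issue).
    registered_children: iterable of child node names as a ZooKeeper client
                         would return them, e.g. ["1", "3"].
    """
    registered = {int(child) for child in registered_children}
    return sorted(set(expected_ids) - registered)


if __name__ == "__main__":
    # Hypothetical sample: broker 2 has dropped out of /brokers/ids.
    print(missing_broker_ids([1, 2, 3], ["1", "3"]))
```

A check like this only reports that a registration is gone; it does not 
explain why, and it does not make the broker re-register.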

> Kafka Broker ID disappears from /brokers/ids
> --------------------------------------------
>
>                 Key: KAFKA-3893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3893
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: chaitra
>            Priority: Critical
>
> Kafka version used: 0.8.2.1
> ZooKeeper version: 3.4.6
> We have a scenario where the Kafka broker's entry under the ZooKeeper path 
> /brokers/ids just disappears.
> We see the ZooKeeper connection active and no network issue.
> The ZooKeeper connection timeout is set to 6000ms in server.properties.
> Hence Kafka is not participating in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
