[
https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360306#comment-15360306
]
Peter Davis commented on KAFKA-3893:
------------------------------------
Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka
-- when a zookeeper connection is lost, any other changes in the cluster during
the loss are not recognized when it reconnects. We see the same loop of
"Shrinking ISR" and "Cached zkVersion [###] not equal to that in zookeeper",
and the broker never recovers.
For us this happened almost daily when running on a cluster of virtual machines
that would get paused for a few seconds every night for a snapshot backup. We
disabled the backup but it's very concerning that Kafka won't recover after a
pause!
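For anyone who wants to catch this condition before it causes trouble, here is
a minimal sketch of a check, not a fix. It assumes you already have the list of
child node names under /brokers/ids from whatever ZooKeeper client you use (the
fetch itself is left out), and the function name missing_broker_ids is
hypothetical, made up for this illustration.

```python
def missing_broker_ids(expected_ids, registered_children):
    """Return the expected broker IDs that are absent from /brokers/ids.

    expected_ids: iterable of int broker IDs that should be alive.
    registered_children: child node names read from /brokers/ids
        (strings such as "1", "2"), obtained by the caller.
    """
    # Broker registrations are named after the numeric broker ID.
    registered = {int(name) for name in registered_children}
    return sorted(set(expected_ids) - registered)


if __name__ == "__main__":
    # Hypothetical example: broker 2 has dropped out of /brokers/ids.
    print(missing_broker_ids([1, 2, 3], ["1", "3"]))
```

A non-empty result while the broker process is still running would match the
symptom described in this issue.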
> Kafka Broker ID disappears from /brokers/ids
> --------------------------------------------
>
> Key: KAFKA-3893
> URL: https://issues.apache.org/jira/browse/KAFKA-3893
> Project: Kafka
> Issue Type: Bug
> Reporter: chaitra
> Priority: Critical
>
> Kafka version used : 0.8.2.1
> Zookeeper version: 3.4.6
> We have a scenario where Kafka's broker in the zookeeper path /brokers/ids
> just disappears.
> We see the zookeeper connection active and no network issue.
> The zookeeper connection timeout is set to 6000ms in server.properties
> Hence Kafka is not participating in the cluster
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)