[jira] [Comment Edited] (KAFKA-3893) Kafka Borker ID disappears from /borkers/ids
[ https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616914#comment-15616914 ]

Rekha Joshi edited comment on KAFKA-3893 at 11/30/16 2:58 AM:
--------------------------------------------------------------

We saw this too on Kafka 0.8.x, and continued to see it after upgrading to Kafka 0.9.x, especially on AWS. It occurred in particular when the zookeeper connect property was configured with a ZooKeeper ELB instead of the set of individual IP:port entries. When we reverted the zookeeper connect property to the ZooKeeper IP:port entries instead of the ZooKeeper ELB, it worked great.
It would help if the Apache Kafka/Confluent FAQ could be updated to note this.
Thanks
Rekha

was (Author: rekhajoshm):
We saw this too on Kafka 0.8.x, and continued to see it after upgrading to Kafka 0.9.x. We are going with workarounds, but it would help if this were documented clearly in the Apache Kafka/Confluent FAQ.
Thanks
Rekha

> Kafka Borker ID disappears from /borkers/ids
> --------------------------------------------
>
>                 Key: KAFKA-3893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3893
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: chaitra
>            Priority: Critical
>
> Kafka version used : 0.8.2.1
> Zookeeper version: 3.4.6
> We have a scenario where Kafka's broker ID in the zookeeper path /brokers/ids just disappears.
> We see the zookeeper connection active and no network issue.
> The zookeeper connection timeout is set to 6000ms in server.properties.
> Hence Kafka is not participating in the cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
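The workaround in the comment above (listing each ZooKeeper server as IP:port rather than pointing at one ELB endpoint) could be sketched roughly as follows. This is only an illustration: the helper name `build_zookeeper_connect`, the example addresses, and the rule that rejects a single entry are assumptions made for this sketch and are not part of Kafka. The only thing taken from Kafka is the `host1:port1,host2:port2/chroot` shape of the `zookeeper.connect` value.

```python
def build_zookeeper_connect(servers, chroot=""):
    """Join individual ZooKeeper host:port entries into a zookeeper.connect value.

    Hypothetical helper for illustration; not part of Kafka or ZooKeeper.
    """
    entries = []
    for server in servers:
        # Split on the last colon so the port is always the final component.
        host, sep, port = server.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {server!r}")
        entries.append(f"{host}:{port}")
    # Assumption of this sketch: a single entry is treated as a likely
    # load-balancer endpoint and rejected, mirroring the advice in the comment.
    if len(entries) < 2:
        raise ValueError("list every ZooKeeper server, not a single endpoint")
    # An optional chroot path is appended once, after the last server.
    return ",".join(entries) + chroot
```

For example, calling it with the made-up addresses `["10.0.0.11:2181", "10.0.0.12:2181", "10.0.0.13:2181"]` gives a comma-separated string that could be placed after `zookeeper.connect=` in server.properties.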
[jira] [Comment Edited] (KAFKA-3893) Kafka Borker ID disappears from /borkers/ids
[ https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360306#comment-15360306 ]

Peter Davis edited comment on KAFKA-3893 at 7/2/16 7:36 PM:
------------------------------------------------------------

Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka -- when a zookeeper connection is lost, it seems any other changes in the cluster during the loss (which would be expected if an outage affects multiple brokers) are not recognized when it reconnects. We see the same loop of "Shrinking ISR" and "Cached zkVersion [###] not equal to that in zookeeper", and the broker never recovers until manually restarted.

For us this happened almost daily when running on a cluster of virtual machines that would get paused for a few seconds every night for a snapshot backup. We disabled the backup but it's very concerning that Kafka won't recover after a pause! Seen with 0.9 and 0.10.

was (Author: davispw):
Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka -- when a zookeeper connection is lost, any other changes in the cluster during the loss are not recognized when it reconnects. We see the same loop of "Shrinking ISR" and "Cached zkVersion [###] not equal to that in zookeeper", and the broker never recovers.

For us this happened almost daily when running on a cluster of virtual machines that would get paused for a few seconds every night for a snapshot backup. We disabled the backup but it's very concerning that Kafka won't recover after a pause!
[jira] [Comment Edited] (KAFKA-3893) Kafka Borker ID disappears from /borkers/ids
[ https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345288#comment-15345288 ]

Sriharsha Chintalapani edited comment on KAFKA-3893 at 6/22/16 10:30 PM:
-------------------------------------------------------------------------

[~chaithrar...@gmail.com] it looks like the brokers are losing their connection to zookeeper, and hence the ephemeral node under /brokers/ids will disappear. I advise you to set the connection timeout 3 ms. I advise you to post your questions in the kafka mailing lists before opening a JIRA, as it doesn't look like a bug in kafka.

was (Author: sriharsha):
[~chaithrar...@gmail.com] it looks like the brokers are losing their connection to zookeeper, and hence the ephemeral node under /brokers/ids will disappear. I advise you to set the connection timeout 3 ms. I advise you post your questions in the kafka mailing lists before opening a JIRA, as it doesn't look like a bug in kafka.
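One way to observe the disappearing ephemeral node described above is to compare the broker IDs you expect with the children of /brokers/ids. The following is a minimal sketch, not an official tool: the function name `missing_broker_ids` and the `expected_ids` list are hypothetical, and the client is assumed to be any object exposing a `get_children(path)` method that returns child znode names as strings (kazoo's `KazooClient` offers such a method).

```python
def missing_broker_ids(zk_client, expected_ids, path="/brokers/ids"):
    """Return the expected broker IDs that have no znode under `path`.

    Hypothetical helper: `zk_client` is any object with a
    get_children(path) method returning child names as strings.
    """
    # Each live broker registers an ephemeral child named after its broker ID.
    registered = {int(child) for child in zk_client.get_children(path)}
    # Whatever is expected but not registered has dropped out of the cluster.
    return sorted(set(expected_ids) - registered)


# Possible usage with kazoo (third-party; the address is made up):
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="10.0.0.11:2181")
#   zk.start()
#   print(missing_broker_ids(zk, [1, 2, 3]))
#   zk.stop()
```

Taking the client as a parameter keeps the check independent of any particular ZooKeeper library and makes it easy to exercise with a stub. A non-empty result only shows that a registration is gone; it does not say whether the cause is a session expiry or the reconnect problem discussed in this issue.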