Not sure how stable ZK 3.4.5 is. Could you try 3.3.4? Also, see if https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog? is applicable.
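
If I remember that FAQ entry correctly, the usual suspects are ZooKeeper session expirations on the consumer side (often GC pauses), and the suggested check is that rebalance.max.retries * rebalance.backoff.ms is larger than zookeeper.session.timeout.ms. Below is a rough sketch of that check against the 0.8 high-level consumer properties. The ZooKeeper address and the numeric values are only assumptions for illustration, the group name is taken from the ZK path in your log, and the class and method names are made up for this example.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds consumer properties and
// checks that the rebalance retry window outlasts the ZooKeeper session timeout.
final class RebalanceConfigCheck {

    // Illustrative values only; tune them for your environment.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", "localhost:2181");  // assumed address
        props.setProperty("group.id", "alarms.topology.updates");  // from the ZK path in the log
        props.setProperty("zookeeper.session.timeout.ms", "10000");
        props.setProperty("rebalance.backoff.ms", "3000");
        props.setProperty("rebalance.max.retries", "4");
        return props;
    }

    // True when retries * backoff exceeds the session timeout, so a stale
    // ephemeral consumer node has time to expire before the retries run out.
    static boolean retryWindowCoversSession(Properties props) {
        long sessionMs = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        long backoffMs = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        return retries * backoffMs > sessionMs;
    }

    public static void main(String[] args) {
        Properties props = consumerProps();
        System.out.println("retry window covers session timeout: "
                + retryWindowCoversSession(props));
    }
}
```

With clj-kafka you would pass the same keys in the consumer config map rather than building a Properties object by hand.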

Thanks,

Jun

On Wed, Jan 22, 2014 at 6:24 AM, Ahmed H. <ahmed.ham...@gmail.com> wrote:

> I have a basic Zookeeper/Kafka setup. I am still on Kafka 0.8 beta 1, and
> Zookeeper 3.4.5. The activity on this machine isn't massive...I would say
> the Kafka queues get a consistent 1 message every 2-3 seconds, as well as
> occasional spikes, but still nothing large enough to push the limits. Both
> Kafka and Zookeeper are running on the same machine.
>
> Occasionally, a rebalance is triggered, which causes our Kafka clients to
> try reconnecting several times, but it ultimately fails with the following
> error:
>
> 04:56:10,020 INFO [kafka.consumer.ZookeeperConsumerConnector]
> (alarms.topology.updates_<host>-1383643783747-c7775701_watcher_executor)
> [alarms.topology.updates_<host>-1383643783747-c7775701], exception
> during rebalance : org.I0Itec.zkclient.exception.ZkNoNodeException:
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
> = NoNode for
> /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
>     at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) [zkclient-0.3.jar:0.3]
>     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) [zkclient-0.3.jar:0.3]
>     at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) [zkclient-0.3.jar:0.3]
>     at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) [zkclient-0.3.jar:0.3]
>     at kafka.utils.ZkUtils$.readData(ZkUtils.scala:407) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
>     at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:52) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
>     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:401) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
>     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
>     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78) [scala-library-2.9.2.jar:]
>     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:369) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
>     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
> KeeperErrorCode = NoNode for
> /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) [zookeeper-3.4.3.jar:3.4.3-1240972]
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) [zookeeper-3.4.3.jar:3.4.3-1240972]
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131) [zookeeper-3.4.3.jar:3.4.3-1240972]
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160) [zookeeper-3.4.3.jar:3.4.3-1240972]
>     at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103) [zkclient-0.3.jar:0.3]
>     at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770) [zkclient-0.3.jar:0.3]
>     at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766) [zkclient-0.3.jar:0.3]
>     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) [zkclient-0.3.jar:0.3]
>     ... 9 more
>
> Our Kafka consumers are written in Clojure
> (https://github.com/pingles/clj-kafka).
>
> Any ideas on what can cause such behaviour? The rebalances themselves
> happen sporadically, but when they do, they sometimes fail and an error
> like the one above is shown. I'm not sure if this is a Kafka or Zookeeper
> problem at this point, but any help would be appreciated.
>
> Thanks