Re: Kafka rebalancing causes Zookeeper to fail

2014-01-23 Thread Ahmed H.
When you say use a larger session timeout, which session timeout do you
refer to? Is it the zookeeper session timeout variable that we define when
creating a Kafka consumer? Or is there a different session timeout?

As for downgrading, that is not an option for the time being, so
I will need better debugging tools to pinpoint the cause.

Thanks


On Wed, Jan 22, 2014 at 11:44 PM, Jun Rao jun...@gmail.com wrote:

 You can find some of the GC settings in
 https://cwiki.apache.org/confluence/display/KAFKA/Operations

 There were some ZK bugs exposed during session expiration, which were fixed
 in 3.3.4. Not sure if 3.4.5 exposes any new issues. The easiest thing is
 probably to avoid GC-induced ZK session timeout in the first place or use a
 larger session timeout.

 Thanks,

 Jun



Re: Kafka rebalancing causes Zookeeper to fail

2014-01-23 Thread Jun Rao
It is zookeeper.session.timeout.ms in the consumer config.
Thanks,
Jun
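
For illustration, here is a minimal sketch of setting that property on the 0.8 high-level consumer. The connection string, group name, and the 30000 ms value are placeholders chosen for the example, not values taken from this thread. The retry-window check reflects a rule of thumb I believe the Kafka FAQ gives (rebalance.max.retries * rebalance.backoff.ms should exceed the session timeout); treat it as an assumption to verify against the FAQ for your version.

```java
import java.util.Properties;

public class ConsumerTimeoutConfig {

    // Builds consumer properties with an explicit ZooKeeper session timeout.
    // All argument values are supplied by the caller; nothing here is a recommended setting.
    static Properties build(String zkConnect, String groupId, int sessionTimeoutMs,
                            int rebalanceMaxRetries, int rebalanceBackoffMs) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zkConnect);
        props.put("group.id", groupId);
        props.put("zookeeper.session.timeout.ms", Integer.toString(sessionTimeoutMs));
        props.put("rebalance.max.retries", Integer.toString(rebalanceMaxRetries));
        props.put("rebalance.backoff.ms", Integer.toString(rebalanceBackoffMs));
        return props;
    }

    // Assumed rule of thumb: the total rebalance retry window should be longer
    // than the ZooKeeper session timeout, so that a stale ephemeral node has
    // time to expire before the consumer gives up rebalancing.
    static boolean retryWindowCoversSession(Properties props) {
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        long backoff = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        long session = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        return retries * backoff > session;
    }

    public static void main(String[] args) {
        // Placeholder values for the example only.
        Properties props = build("localhost:2181", "example-group", 30000, 10, 4000);
        System.out.println("session timeout = "
                + props.getProperty("zookeeper.session.timeout.ms") + " ms");
        System.out.println("retry window covers session: " + retryWindowCoversSession(props));
        // With the Kafka 0.8 jars on the classpath, these properties would be passed on as:
        //   new kafka.consumer.ConsumerConfig(props)
    }
}
```

From Clojure, the same keys would go into whatever configuration map the client library hands to the consumer; how clj-kafka names or passes them is not covered here.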


On Thu, Jan 23, 2014 at 11:24 AM, Ahmed H. ahmed.ham...@gmail.com wrote:

 When you say use a larger session timeout, which session timeout do you
 refer to? Is it the zookeeper session timeout variable that we define when
 creating a Kafka consumer? Or is there a different session timeout?

 As for downgrading, that is not an option for the time being, so
 I will need better debugging tools to pinpoint the cause.

 Thanks



Re: Kafka rebalancing causes Zookeeper to fail

2014-01-22 Thread Jun Rao
Not sure how stable ZK 3.4.5 is. Could you try 3.3.4? Also, see if
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog?
is applicable.

Thanks,

Jun


On Wed, Jan 22, 2014 at 6:24 AM, Ahmed H. ahmed.ham...@gmail.com wrote:

 I have a basic Zookeeper/Kafka setup. I am still on Kafka 0.8 beta 1, and
 Zookeeper 3.4.5. The activity on this machine isn't massive...I would say
 the Kafka queues get a consistent 1 message every 2-3 seconds, as well as
 occasional spikes, but still nothing large enough to push the limits. Both
 Kafka and Zookeeper are running on the same machine.

 Occasionally, a rebalance is triggered, which causes our Kafka clients to
 try reconnecting several times, but it ultimately fails with the following
 error:


 04:56:10,020 INFO  [kafka.consumer.ZookeeperConsumerConnector] (alarms.topology.updates_host-1383643783747-c7775701_watcher_executor) [alarms.topology.updates_host-1383643783747-c7775701], exception during rebalance : org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_host-1383643783747-c7775701
   at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) [zkclient-0.3.jar:0.3]
   at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) [zkclient-0.3.jar:0.3]
   at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) [zkclient-0.3.jar:0.3]
   at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) [zkclient-0.3.jar:0.3]
   at kafka.utils.ZkUtils$.readData(ZkUtils.scala:407) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
   at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:52) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:401) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78) [scala-library-2.9.2.jar:]
   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:369) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_host-1383643783747-c7775701
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) [zookeeper-3.4.3.jar:3.4.3-1240972]
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) [zookeeper-3.4.3.jar:3.4.3-1240972]
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131) [zookeeper-3.4.3.jar:3.4.3-1240972]
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160) [zookeeper-3.4.3.jar:3.4.3-1240972]
   at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103) [zkclient-0.3.jar:0.3]
   at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770) [zkclient-0.3.jar:0.3]
   at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766) [zkclient-0.3.jar:0.3]
   at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) [zkclient-0.3.jar:0.3]
   ... 9 more


 Our Kafka consumers are written in Clojure (
 https://github.com/pingles/clj-kafka).

 Any ideas on what can cause such behaviour? The rebalances themselves
 happen sporadically, but when they do, they sometimes fail and an error
 like the one above is shown. I'm not sure if this is a Kafka or Zookeeper
 problem at this point, but any help would be appreciated.

 Thanks



Re: Kafka rebalancing causes Zookeeper to fail

2014-01-22 Thread Ahmed H.
Hello,

I looked at that, not sure if it is applicable or not at this point. We
used to have frequent rebalances, but that issue was mitigated by
increasing the zktimeout on the consumer side. With that said, it may still
be a problem. I haven't collected any metrics concerning rebalances in a
while. I will certainly take a look at our current GC settings. What are
typical settings that we should have for GC (I am not sure of what exactly
I'm looking for)?

As for downgrading the Zookeeper version, would there be any major loss of
functionality? Version 3.4.5 is currently stable, so I am unsure of how it
would help. I can try it and let it soak for a while to see if it helps or
not. The problem is we have many components that tie into Zookeeper and I'm
worried that downgrading may break some of our API calls to it.

Is there a good way of trying to narrow this problem down further?

Thanks again


On Wed, Jan 22, 2014 at 10:15 AM, Jun Rao jun...@gmail.com wrote:

 Not sure how stable ZK 3.4.5 is. Could you try 3.3.4? Also, see if
 https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog?
 is applicable.

 Thanks,

 Jun



Re: Kafka rebalancing causes Zookeeper to fail

2014-01-22 Thread Jun Rao
You can find some of the GC settings in
https://cwiki.apache.org/confluence/display/KAFKA/Operations

There were some ZK bugs exposed during session expiration, which were fixed
in 3.3.4. Not sure if 3.4.5 exposes any new issues. The easiest thing is
probably to avoid GC-induced ZK session timeout in the first place or use a
larger session timeout.

Thanks,

Jun
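
One way to check whether GC is a plausible cause is to sample the JVM's own collector counters from inside the consumer process and compare them with the session timeout. The sketch below uses the standard java.lang.management API; the class name, the 6000 ms timeout, the one-second sampling interval, and the "half the timeout" threshold are all assumptions made for the example. Note that getCollectionTime() reports cumulative collection time, which can overstate stop-the-world pauses with concurrent collectors, so this is only a rough signal and GC logs remain the more precise source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcPauseWatch {

    // Sums the cumulative collection time (ms) reported by every collector in this JVM.
    // Collectors that report -1 (unsupported) are skipped.
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Arbitrary example threshold: flag an interval in which GC time reached
    // at least half of the ZooKeeper session timeout.
    static boolean isSuspicious(long gcDeltaMs, long sessionTimeoutMs) {
        return gcDeltaMs * 2 >= sessionTimeoutMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long sessionTimeoutMs = 6000; // assumed value; use your consumer's actual setting
        long previous = totalGcTimeMs();
        for (int i = 0; i < 5; i++) {
            Thread.sleep(1000);
            long now = totalGcTimeMs();
            long delta = now - previous;
            previous = now;
            System.out.println("GC time in last interval: " + delta + " ms"
                    + (isSuspicious(delta, sessionTimeoutMs) ? " (close to session timeout)" : ""));
        }
    }
}
```

To be useful, the sampling loop would have to run on a background thread inside the consumer JVM itself, since the counters describe only the process that reads them.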


On Wed, Jan 22, 2014 at 8:29 AM, Ahmed H. ahmed.ham...@gmail.com wrote:

 Hello,

 I looked at that, not sure if it is applicable or not at this point. We
 used to have frequent rebalances, but that issue was mitigated by
 increasing the zktimeout on the consumer side. With that said, it may still
 be a problem. I haven't collected any metrics concerning rebalances in a
 while. I will certainly take a look at our current GC settings. What are
 typical settings that we should have for GC (I am not sure of what exactly
 I'm looking for)?

 As for downgrading the Zookeeper version, would there be any major loss of
 functionality? Version 3.4.5 is currently stable, so I am unsure of how it
 would help. I can try it and let it soak for a while to see if it helps or
 not. The problem is we have many components that tie into Zookeeper and I'm
 worried that downgrading may break some of our API calls to it.

 Is there a good way of trying to narrow this problem down further?

 Thanks again

