zookeeper.session.timeout.ms in the consumer config.
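
For illustration, here is a minimal sketch of setting that property when building the consumer properties for the 0.8 high-level consumer. The class name ConsumerTimeoutConfig, the helper build(), the address localhost:2181, and the 30000 ms value are assumptions made up for this example and are not recommendations from this thread; the group id is taken from the log output quoted below. The sketch only builds a java.util.Properties object and does not depend on the Kafka jar.

```java
import java.util.Properties;

// Hypothetical helper: collects the consumer settings relevant to this thread.
class ConsumerTimeoutConfig {

    // Builds consumer properties with an explicit ZooKeeper session timeout.
    static Properties build(String zkConnect, String groupId, int sessionTimeoutMs) {
        Properties props = new Properties();
        // ZooKeeper connection string used by the high-level consumer.
        props.put("zookeeper.connect", zkConnect);
        // Consumer group; it also appears in the /consumers/<group>/ids path.
        props.put("group.id", groupId);
        // The setting discussed above: a larger value lets the session
        // survive longer pauses (for example GC pauses) before it expires.
        props.put("zookeeper.session.timeout.ms", Integer.toString(sessionTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // Assumed values, for illustration only.
        Properties props = build("localhost:2181", "alarms.topology.updates", 30000);
        // In an application these properties would typically be passed to
        // new kafka.consumer.ConsumerConfig(props).
        System.out.println(props.getProperty("zookeeper.session.timeout.ms"));
    }
}
```

Note that the ZooKeeper server bounds the session timeout it will grant (by default between 2 and 20 times its tickTime), so a very large client-side value may be negotiated down by the server.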
Thanks,
Jun

On Thu, Jan 23, 2014 at 11:24 AM, Ahmed H. <ahmed.ham...@gmail.com> wrote:

> When you say "use a larger session timeout", which session timeout do you
> refer to? Is it the zookeeper session timeout variable that we define when
> creating a Kafka consumer? Or is there a different session timeout?
>
> As for downgrading, that is not an option for the time being, so
> I will need better debugging tools to pinpoint the cause.
>
> Thanks
>
>
> On Wed, Jan 22, 2014 at 11:44 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > You can find some of the GC settings in
> > https://cwiki.apache.org/confluence/display/KAFKA/Operations
> >
> > There were some ZK bugs exposed during session expiration, which were
> > fixed in 3.3.4. Not sure if 3.4.5 exposes any new issues. The easiest
> > thing is probably to avoid GC-induced ZK session timeout in the first
> > place or use a larger session timeout.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > > On Wed, Jan 22, 2014 at 8:29 AM, Ahmed H. <ahmed.ham...@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > > I looked at that, not sure if it is applicable or not at this point. We
> > > > used to have frequent rebalances, but that issue was mitigated by
> > > > increasing the zktimeout on the consumer side. With that said, it may
> > > > still be a problem. I haven't collected any metrics concerning
> > > > rebalances in a while. I will certainly take a look at our current GC
> > > > settings. What are typical settings that we should have for GC (I am
> > > > not sure of what exactly I'm looking for)?
> > >
> > > > As for downgrading the Zookeeper version, would there be any major loss
> > > > of functionality? Version 3.4.5 is currently stable, so I am unsure of
> > > > how it would help. I can try it and let it soak for a while to see if it
> > > > helps or not. The problem is we have many components that tie into
> > > > Zookeeper and I'm worried that downgrading may break some of our API
> > > > calls to it.
> > >
> > > Is there a good way of trying to narrow this problem down further?
> > >
> > > Thanks again
> > >
> > >
> > > On Wed, Jan 22, 2014 at 10:15 AM, Jun Rao <jun...@gmail.com> wrote:
> > >
> > > > > Not sure how stable ZK 3.4.5 is. Could you try 3.3.4? Also, see if
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog?
> > > > > is applicable.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > > On Wed, Jan 22, 2014 at 6:24 AM, Ahmed H. <ahmed.ham...@gmail.com> wrote:
> > > >
> > > > > > I have a basic Zookeeper/Kafka setup. I am still on Kafka 0.8 beta 1,
> > > > > > and Zookeeper 3.4.5. The activity on this machine isn't massive...I
> > > > > > would say the Kafka queues get a consistent 1 message every 2-3
> > > > > > seconds, as well as occasional spikes, but still nothing large enough
> > > > > > to push the limits. Both Kafka and Zookeeper are running on the same
> > > > > > machine.
> > > > >
> > > > > > Occasionally, a rebalance is triggered, which causes our Kafka
> > > > > > clients to try reconnecting several times, but it ultimately fails
> > > > > > with the following error:
> > > > >
> > > > >
> > > > > > 04:56:10,020 INFO  [kafka.consumer.ZookeeperConsumerConnector] (alarms.topology.updates_<host>-1383643783747-c7775701_watcher_executor) [alarms.topology.updates_<host>-1383643783747-c7775701], exception during rebalance : org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
> > > > > >         at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) [zkclient-0.3.jar:0.3]
> > > > > >         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) [zkclient-0.3.jar:0.3]
> > > > > >         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) [zkclient-0.3.jar:0.3]
> > > > > >         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) [zkclient-0.3.jar:0.3]
> > > > > >         at kafka.utils.ZkUtils$.readData(ZkUtils.scala:407) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > >         at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:52) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:401) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > >         at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78) [scala-library-2.9.2.jar:]
> > > > > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:369) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > > Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
> > > > > >         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > > >         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > > >         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > > >         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > > >         at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103) [zkclient-0.3.jar:0.3]
> > > > > >         at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770) [zkclient-0.3.jar:0.3]
> > > > > >         at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766) [zkclient-0.3.jar:0.3]
> > > > > >         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) [zkclient-0.3.jar:0.3]
> > > > > >         ... 9 more
> > > > >
> > > > >
> > > > > > Our Kafka consumers are written in Clojure
> > > > > > (https://github.com/pingles/clj-kafka).
> > > > > >
> > > > > > Any ideas on what can cause such behaviour? The rebalances themselves
> > > > > > happen sporadically, but when they do, they sometimes fail and an
> > > > > > error like the one above is shown. I'm not sure if this is a Kafka or
> > > > > > Zookeeper problem at this point, but any help would be appreciated.
> > > > >
> > > > > Thanks
> > > > >
> > > >
> > >
> >
>
