Matt Jones created KAFKA-462:
--------------------------------
Summary: ZK thread crashing doesn't bring down the broker (and the
thread doesn't come back up).
Key: KAFKA-462
URL: https://issues.apache.org/jira/browse/KAFKA-462
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.7
Reporter: Matt Jones
I think the simplest explanation is the traceback below. The broker had been up
since 2012-07-31 18:45:42,951 (based on the 'Starting Kafka server' log
entry), and the error was cleared by restarting the broker at 2012-08-14
20:59:41,581.
It looks like the ZooKeeper thread crashed, but the broker kept operating as
usual. The expected behavior is that a ZooKeeper thread crash either brings
down the whole broker or the thread restarts itself.
[2012-08-08 01:25:13,398] 624270894 [main-SendThread(zookeeper001:2181)] INFO
org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
from server in 8749ms for sessionid 0x138e4edc04c1e50, closing socket
connection and attempting reconnect
[2012-08-08 01:25:15,136] 624272632 [main-EventThread] INFO
org.I0Itec.zkclient.ZkClient - zookeeper state changed (Disconnected)
[2012-08-08 01:25:15,702] 624273198 [main-SendThread(zookeeper001:2181)] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
zookeeper003/10.125.95.193:2181
[2012-08-08 01:25:15,704] 624273200 [main-SendThread(zookeeper003:2181)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
zookeeper003/10.125.95.193:2181, initiating session
[2012-08-08 01:25:15,709] 624273205 [main-EventThread] INFO
org.I0Itec.zkclient.ZkClient - zookeeper state changed (Expired)
[2012-08-08 01:25:15,709] 624273205 [main-EventThread] INFO
org.apache.zookeeper.ZooKeeper - Initiating client connection,
connectString=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181
sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@26d66426
[2012-08-08 01:25:21,514] 624279010 [main-SendThread(zookeeper003:2181)] INFO
org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service,
session 0x138e4edc04c1e50 has expired, closing socket connection
[2012-08-08 01:25:47,135] 624304631 [main-EventThread] ERROR
org.apache.zookeeper.ClientCnxn - Error while calling watcher
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
Caused by: org.I0Itec.zkclient.exception.ZkException: Unable to connect to
zookeeper001:2181,zookeeper002:2181,zookeeper003:2181
Caused by: java.net.UnknownHostException: zookeeper001
at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:386)
at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:331)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:377)
[2012-08-08 01:25:48,620] 624306116 [main-EventThread] INFO
org.apache.zookeeper.ClientCnxn - EventThread shut down
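The two acceptable outcomes described above (the ZooKeeper client recovers, or
the whole broker goes down) can be sketched as a bounded retry loop with a fatal
fallback. This is a minimal, self-contained illustration and not Kafka or
ZkClient code: the class ZkReconnectSketch, the method reconnectOrFail, and its
parameters are hypothetical names, and the connect step is simulated with a
Callable that throws UnknownHostException, as in the log above. In a real
broker the onFatal hook might stop the process.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

/** Hypothetical sketch of "reconnect or fail loudly"; not Kafka or ZkClient code. */
final class ZkReconnectSketch {

    /** Result of one bounded reconnect sequence. */
    enum Outcome { RECONNECTED, GAVE_UP }

    /**
     * Calls connect up to maxAttempts times, sleeping backoffMs between failed
     * attempts. Returns RECONNECTED on the first success. If every attempt
     * fails (or the thread is interrupted while waiting), runs onFatal and
     * returns GAVE_UP, so the failure is never silently swallowed.
     */
    static Outcome reconnectOrFail(Callable<?> connect, int maxAttempts,
                                   long backoffMs, Runnable onFatal) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connect.call();
                return Outcome.RECONNECTED;
            } catch (Exception e) {
                System.err.println("reconnect attempt " + attempt + " failed: " + e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        onFatal.run();
        return Outcome.GAVE_UP;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated connect: host name resolution fails twice, then succeeds.
        Callable<Void> flaky = () -> {
            if (++calls[0] < 3) {
                throw new UnknownHostException("zookeeper001");
            }
            return null;
        };
        Outcome outcome = reconnectOrFail(flaky, 5, 10L,
                () -> System.err.println("fatal: giving up, broker should stop"));
        System.out.println(outcome + " after " + calls[0] + " attempts");
    }
}
```

With a policy like this, a transient name-resolution failure would be retried
instead of ending the event thread, and a persistent one would reach onFatal
rather than leaving the broker running without a ZooKeeper session.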