Hi Stephen, my comments inline below:
On 05/21/2010 09:31 AM, Stephen Green wrote:
I feel like I'm missing something fairly fundamental here. I'm
building a clustered application that uses ZooKeeper (3.3.1) to store
its configuration information. There are 33 nodes in the cluster
(Amazon EC2 instance, if that matters), and I'm currently using a
single ZooKeeper instance. When a node starts up, it makes a
connection to ZK, sets the data on a few paths and makes an ephemeral
node for itself. I keep the connection open while the node is running
so that I can use watches to find out if a node disappears, but after
the initial setup, the node usually won't write or read anything from
ZK.
My understanding (having had a quick look at the code) is that the
client connection will send a ping every sessionTimeout * 2/3 ms or so
to keep the session alive, but I keep seeing sessions dying. On the
Actually the client sends a ping every 1/3 the timeout, and then looks
for a response before another 1/3 elapses. This allows time to reconnect
to a different server (and still maintain the session) if the current
server were to become unavailable.
client side I'll see something like the following sequence of events:
[05/21/10 15:59:40.753] INFO Initiating client connection,
connectString=zookeeper:2200 sessionTimeout=30000
watcher=com.echonest.cluster.zoocontai...@1eb3319f
[05/21/10 15:59:40.767] INFO Socket connection established to
zookeeper/10.255.9.187:2200, initiating session
[05/21/10 15:59:40.787] INFO Session establishment complete on server
zookeeper/10.255.9.187:2200, sessionid = 0x128bb7b828d004c, negotiated
timeout = 30000
Ok, this (^^^) says that the timeout is set to 30sec.
[05/21/10 16:13:03.729] INFO Client session timed out, have not heard
from server in 33766ms for sessionid 0x128bb7b828d004c, closing socket
connection and attempting reconnect
[05/21/10 16:13:19.268] INFO Initiating client connection,
connectString=zookeeper:2200 sessionTimeout=30000
watcher=com.echonest.cluster.zoocontai...@1eb3319f
[05/21/10 16:14:12.326] INFO Client session timed out, have not heard
from server in 53058ms for sessionid 0x128bb7b828d004c, closing socket
connection and attempting reconnect
This (^^^) is very suspicious, in particular "have not heard from server
in 53058ms". This means that the client heartbeat code didn't notice
that the heartbeat was exceeded for 53 seconds! This should never
happen, the client does a "select" with a timeout of 1/3 the session
timeout (10sec here).
The fact that the select is taking 43 addl seconds (53-10sec select
timeout) tells me that your client jvm is not allowing the heartbeat
thread to run. The most common reason for this is GC. Is your client
application very memory intensive? Heavy on GC? You should turn on your
GC logging and review the output after reproducing this issue (turning
on CMS/incremental GC mode usually resolves this issue, but you should
verify first). What we typically see here is that the client JVM is
running GC for very long periods of time, this blocks all the threads,
and as a result the heartbeat is not sent by the client!
As you are running in a virtualized environment this could also be a
factor (it's def an issue from a GC perspective). But I suspect that gc
is the issue here, look at that first.
See this page for some common issues users have faced in the past:
http://bit.ly/5WwS44
If I'm reading this correctly, the connection gets set up and then the
server experiences an error trying to read from the client, so it
closes the connection. It's not clear if this causes the session
timeout or vice-versa (these systems are both running ntp, but I doubt
that we can count on interleaving those log times correctly.)
Yes, this is due to the client not getting the heartbeat, so it will
close the connection and attempt to reestablish.
I started out with a session timeout of 10,000ms, but as you can see,
I have the same problem at 30,000ms.
You may not need to use this after you resolve the GC issue.
Do I have a fundamental misunderstanding? What else should I do to
figure out what's going on here?
As I suggested above, give GC logging a try. I found 'gchisto' a very
useful tool for reviewing the resulting log files.
http://sysadminsjourney.com/content/2008/09/15/profile-your-java-gc-performance-gchisto
Regards,
Patrick