Thanks Ben, that makes sense.

As for why this timeout keeps happening: I'm wondering if I'm running into a swapping issue, because ZooKeeper doesn't have a max heap size specified and this host has 10GB of RAM. The zookeeper process is currently running with 4980MB of RAM, with 104MB resident (according to top). 4980MB is a bit excessive, as I'm only using zookeeper to support replicated leveldb in activemq.

How could I tell if swapping is causing my disconnects?

Also, is anyone familiar with using zookeeper to support replicated leveldb in activemq? If so, is 1GB of heap space enough for zookeeper to support that? That is all we're using this zookeeper for, so ~5GB of heap seems like more than it needs.

For comparison, we've been running this setup in another datacenter where the zookeeper hosts only have 2GB of RAM, and it ran fine there. Those hosts aren't running anymore, and since we didn't specify the JVM heap size I'm not sure how much RAM zookeeper was actually using, but I'm guessing it was somewhere near 1GB (1/2 of RAM)?

adam

-----Original Message-----
From: Benjamin Reed [mailto:[email protected]]
Sent: Wednesday, November 02, 2016 3:02 PM
To: [email protected]
Subject: Re: zookeeper client seems to timeout earlier than it should

clients need to make sure they move off of a dead server on to a new one to keep their connection alive, so generally if the client hasn't heard from the server in 2/3 * sessionTimeout it will try to connect to someone else. if it waited the whole 4 seconds, then by the time it connected to an active server it would be pronounced dead on arrival.

ben

On Wed, Nov 2, 2016 at 5:11 PM, Whitney, Adam <[email protected]> wrote:

> (Sorry if this is a repost … I got a strange response to my original
> email so I’m not sure if it went through or not)
>
> I have a zookeeper cluster with 3 nodes and tick time set to 2s
>
> When a client connects to the cluster I see a log entry like this:
>
> INFO | Session establishment complete on server XXX, sessionid = XXX,
> negotiated timeout = 4000 | org.apache.zookeeper.ClientCnxn |
> main-SendThread(XXX:2181)
>
> Notice the "negotiated timeout = 4000"
>
> But about once a day I see a log entry like this:
>
> INFO | Client session timed out, have not heard from server in 2953ms
> for sessionid XXX, closing socket connection and attempting reconnect
> | org.apache.zookeeper.ClientCnxn | main-SendThread(XXX:2181)
>
> Why would the client (apparently) timeout the session after only 2953ms if
> the negotiated timeout was 4000ms?
>
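
For anyone reading this thread later, the numbers above can be checked with a quick calculation. This is only a sketch of the arithmetic, not the client's actual code. The 2/3 factor comes from Ben's reply and the 2953ms figure from the log line. The function names are made up for illustration, the clamp to 2x and 20x tickTime assumes the server's default minSessionTimeout and maxSessionTimeout, and the requested timeout of 4000ms is an assumption because the thread does not say what the client asked for.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms):
    """Clamp the requested session timeout to the server's bounds.

    Assumes the default bounds of 2 * tickTime and 20 * tickTime.
    """
    lowest = 2 * tick_time_ms
    highest = 20 * tick_time_ms
    return max(lowest, min(requested_ms, highest))


def read_timeout_ms(negotiated_ms):
    """Silence threshold after which the client gives up on its current
    server and tries another one: 2/3 of the negotiated timeout."""
    return negotiated_ms * 2 // 3


TICK_TIME_MS = 2000      # "tick time set to 2s" from the original email
REQUESTED_MS = 4000      # assumption: not stated anywhere in the thread
OBSERVED_SILENCE_MS = 2953  # from the "have not heard from server" log line

negotiated = negotiated_timeout_ms(REQUESTED_MS, TICK_TIME_MS)
threshold = read_timeout_ms(negotiated)

print("negotiated timeout:", negotiated)
print("client read threshold:", threshold)
print("observed silence exceeds threshold:", OBSERVED_SILENCE_MS >= threshold)
```

With a 4000ms negotiated timeout the threshold works out to 2666ms, so a client that has heard nothing for 2953ms is already past it, which matches the log entry. Under the same assumed defaults, a 2s tick time also means 4000ms is the smallest session timeout the server will grant.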
