Re: Timeouts and ping handling

2012-01-19 Thread Manosiz Bhattacharyya
Thanks, Manosiz. On Thu, Jan 19, 2012 at 11:31 AM, Patrick Hunt wrote: > See "preAllocSize" > > http://zookeeper.apache.org/doc/r3.4.2/zookeeperAdmin.html#sc_advancedConfiguration > > On Thu, Jan 19, 2012 at 10:49 AM, Manosiz Bhattacharyya > wrote: > > Thanks a lot for this info. A pointer in t

Re: Timeouts and ping handling

2012-01-19 Thread Patrick Hunt
See "preAllocSize" http://zookeeper.apache.org/doc/r3.4.2/zookeeperAdmin.html#sc_advancedConfiguration On Thu, Jan 19, 2012 at 10:49 AM, Manosiz Bhattacharyya wrote: > Thanks a lot for this info. A pointer in the code to where you do this > preallocation or a flag to disable this would be very be

Re: Timeouts and ping handling

2012-01-19 Thread Manosiz Bhattacharyya
Thanks a lot for this info. A pointer in the code to where you do this preallocation or a flag to disable this would be very beneficial. On Thu, Jan 19, 2012 at 10:18 AM, Ted Dunning wrote: > ZK does pretty much entirely sequential I/O. > > One thing that it does which might be very, very bad fo

Re: Timeouts and ping handling

2012-01-19 Thread Manosiz Bhattacharyya
We are using the ZooKeeper C client version 3.3.4, the same version as the server. We use libpthread-2.10.1.so, and no special time slicing in user code. Will let you know what we find. Thanks, Manosiz. On Thu, Jan 19, 2012 at 10:09 AM, Patrick Hunt wrote: > On Thu, Jan 19, 2012 at 9:31 AM, Manosiz Bhatt

Re: Timeouts and ping handling

2012-01-19 Thread Ted Dunning
ZK does pretty much entirely sequential I/O. One thing that it does which might be very, very bad for SSD is that it pre-allocates disk extents in the log by writing a bunch of zeros. This is to avoid directory updates as the log is written, but it doubles the load on the SSD. On Thu, Jan 19, 20
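
The zero-fill preallocation described in this message can be illustrated with a small sketch. This is not ZooKeeper's implementation (the server is written in Java); the class name `PreallocatedLog`, its methods, and the block size are hypothetical and chosen only for illustration. It shows why each log byte ends up being written twice: once as padding zeros, then again as real data.

```python
import os


class PreallocatedLog:
    """Toy append-only log that grows in zero-filled blocks (illustrative only)."""

    def __init__(self, path, block_bytes=64 * 1024):
        self.block_bytes = block_bytes
        self.f = open(path, "wb")
        self.used = 0        # bytes of real data written so far
        self.allocated = 0   # bytes of file space already zero-filled

    def _grow(self):
        # Extend the file by one block of zeros past the current allocation.
        self.f.seek(self.allocated)
        self.f.write(b"\0" * self.block_bytes)
        self.allocated += self.block_bytes

    def append(self, record):
        # Make sure the record fits inside already-allocated space.
        while self.used + len(record) > self.allocated:
            self._grow()
        # Overwrite the padding zeros with the real data.
        self.f.seek(self.used)
        self.f.write(record)
        self.used += len(record)

    def close(self):
        self.f.flush()
        os.fsync(self.f.fileno())
        self.f.close()
```

With this scheme the file length only changes when a new block is added, which is the benefit the message mentions; the cost is the extra pass of zero writes over every block.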

Re: Timeouts and ping handling

2012-01-19 Thread Patrick Hunt
On Thu, Jan 19, 2012 at 9:31 AM, Manosiz Bhattacharyya wrote: > I do not think that there is a problem with the queue size. I guess the > problem is more with latency when the Fusion I/O goes in for a GC. We are > enabling stats on the Zookeeper and the fusion I/O to be more precise. Does > Zookee

Re: Timeouts and ping handling

2012-01-19 Thread Manosiz Bhattacharyya
I do not think that there is a problem with the queue size. I guess the problem is more with latency when the Fusion I/O device goes in for a GC. We are enabling stats on ZooKeeper and the Fusion I/O device to be more precise. Does ZooKeeper typically do only sequential I/O, or does it do some random I/O too? We

Re: Timeouts and ping handling

2012-01-18 Thread Ted Dunning
If you aren't pushing much data through ZK, there is almost no way that the request queue can fill up without the log or snapshot disks being slow. See what happens if you put the log into a real disk or (heaven help us) onto a tmpfs partition. On Thu, Jan 19, 2012 at 2:18 AM, Manosiz Bhattachary

Re: Timeouts and ping handling

2012-01-18 Thread Manosiz Bhattacharyya
I will do as you mention. We are using the async APIs throughout. Also, we do not write much data into ZooKeeper. We just use it for leadership elections and health monitoring, which is why we typically see the timeouts on idle ZooKeeper connections. The reason why we want the sessions to be

Re: Timeouts and ping handling

2012-01-18 Thread Patrick Hunt
On Wed, Jan 18, 2012 at 4:47 PM, Manosiz Bhattacharyya wrote: > Thanks Patrick for your answer, No problem. > Actually we are in a virtualized environment, we have a FIO disk for > transactional logs. It does have some latency sometimes during FIO garbage > collection. We know this could be the

Re: Timeouts and ping handling

2012-01-18 Thread Manosiz Bhattacharyya
I was not suggesting that we should not detect a stuck server. A watchdog of some sort keeping track of queue changes could also suffice. Thanks for your input. I guess we will try to work it out by increasing the timeout. -- Manosiz. On Wed, Jan 18, 2012 at 4:54 PM, Ted Dunning w

Re: Timeouts and ping handling

2012-01-18 Thread Manosiz Bhattacharyya
Yes. On Wed, Jan 18, 2012 at 5:15 PM, Ted Dunning wrote: > Does FIO stand for Fusion I/O? > > On Thu, Jan 19, 2012 at 12:47 AM, Manosiz Bhattacharyya > wrote: > > > ... we have a FIO disk >

Re: Timeouts and ping handling

2012-01-18 Thread Patrick Hunt
On Wed, Jan 18, 2012 at 3:21 PM, Camille Fournier wrote: > Duh, I knew there was something I was forgetting. You can't process the > session timeout faster than the server can process the full pipeline, so > making pings come back faster just means you will have a false sense of > liveness for yo

Re: Timeouts and ping handling

2012-01-18 Thread Ted Dunning
Does FIO stand for Fusion I/O? On Thu, Jan 19, 2012 at 12:47 AM, Manosiz Bhattacharyya wrote: > ... we have a FIO disk

Re: Timeouts and ping handling

2012-01-18 Thread Ted Dunning
That really depends on whether you think that a stuck server is a problem. The primary indication of that is a full queue and you are suggesting that we not detect this situation. It isn't a matter of keeping the session alive ... it is a matter of whether or not we can guarantee that things are

Re: Timeouts and ping handling

2012-01-18 Thread Manosiz Bhattacharyya
Thanks, Patrick, for your answer. Actually, we are in a virtualized environment, and we have a FIO disk for transactional logs. It sometimes has some latency during FIO garbage collection. We know this could be the issue, but we were trying to work around that. We were trying to qualify the r

Re: Timeouts and ping handling

2012-01-18 Thread Camille Fournier
Duh, I knew there was something I was forgetting. You can't process the session timeout faster than the server can process the full pipeline, so making pings come back faster just means you will have a false sense of liveness for your services. The question about why the leaders and followers hand

Re: Timeouts and ping handling

2012-01-18 Thread Patrick Hunt
Next up is disk. (I'm assuming you're not running in a virtualized environment, correct?) You have a dedicated log device for the transactional logs? Check your disk latency and make sure that's not holding up the writes. What does "stat" show you wrt latency in general and at the time you see the

Re: Timeouts and ping handling

2012-01-18 Thread Manosiz Bhattacharyya
Thanks a lot for your response. We are running the c-client, as all our components are C++ applications. We are tracing GC on the server side, but did not see much activity there. We did tune GC. Our gc flags include the following JVMFLAGS="$JVMFLAGS -XX:+UseParNewGC" JVMFLAGS="$JVMFLAGS -XX:+UseC

Re: Timeouts and ping handling

2012-01-18 Thread Ted Dunning
Monitor GC on *both* ZK server and client. Either side can easily cause a 1-2 second delay if misconfigured. On Wed, Jan 18, 2012 at 10:34 PM, Patrick Hunt wrote: > I suspect that you are being affected by GC pauses. Have you tuned the > GC at all or are you just using the defaults? Monitor the GC in the VM

Re: Timeouts and ping handling

2012-01-18 Thread Patrick Hunt
On Wed, Jan 18, 2012 at 2:03 PM, Camille Fournier wrote: > I think it can be done. Looking through the code, it seems like it should > be safe modulo some stats that are set in the FinalRequestProcessor that > may be less useful. > Turning around HBs at the head end of the server is a bad idea. I

Re: Timeouts and ping handling

2012-01-18 Thread Patrick Hunt
Forgot to mention: use "stat" and some of the other four-letter words to get an idea of what your request latency looks like across servers. In particular, you can see the "max latency" and correlate that with what you're seeing on the clients and with GC (etc.) activity. Patrick On Wed, Jan 18, 2012 at 2:34 P
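
The "stat" check suggested in this message can be scripted. The following is a minimal sketch, assuming the server answers four-letter words on its client port and that the reply contains a line of the form "Latency min/avg/max: a/b/c". The function names, the default timeout, and the host and port in the usage comment are hypothetical.

```python
import re
import socket


def four_letter_word(host, port, cmd="stat", timeout=5.0):
    """Send a four-letter word to a ZooKeeper server and return the text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


_LATENCY_RE = re.compile(
    r"^Latency min/avg/max:\s*([\d.]+)/([\d.]+)/([\d.]+)", re.MULTILINE
)


def parse_latency(stat_output):
    """Return (min, avg, max) latency in ms from 'stat' output, or None."""
    match = _LATENCY_RE.search(stat_output)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


# Usage (assumed host and port; run against each server in the ensemble):
#   print(parse_latency(four_letter_word("zk1.example.com", 2181)))
```

Polling every server this way and logging the max latency with timestamps makes it easier to line up latency spikes with client timeouts and GC logs.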

Re: Timeouts and ping handling

2012-01-18 Thread Patrick Hunt
5 seconds is fairly low. Heartbeats (HBs) are sent by the client every 1/3 of the timeout, with the expectation that it will get a response within another 1/3 of the timeout. If not, the client session will time out. As a result, any blip of 1.5 sec or more between the client and server could cause this to happen. Network latenc
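
The 1/3 rule in this message can be turned into a quick calculation. This is a minimal sketch that assumes the rule exactly as stated above; the function name and dictionary keys are hypothetical and not part of any ZooKeeper API.

```python
def heartbeat_budget(session_timeout_s):
    """Split a session timeout into the thirds described in the message."""
    third = session_timeout_s / 3.0
    return {
        "ping_interval_s": third,    # client sends a heartbeat this often
        "response_window_s": third,  # time allowed for the reply to arrive
    }


budget = heartbeat_budget(5.0)
print(budget["ping_interval_s"], budget["response_window_s"])
```

For a 5 second timeout each third is about 1.67 seconds, which is in line with the roughly 1.5 second figure in the message: a stall of that length on the network, the disk, or in GC on either side is enough to use up the response window.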

Re: Timeouts and ping handling

2012-01-18 Thread Camille Fournier
I think it can be done. Looking through the code, it seems like it should be safe modulo some stats that are set in the FinalRequestProcessor that may be less useful. A question for the other ZooKeeper devs out there: is there a reason that we handle read-only operations in the first processor dif