Take a look at this section to start:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_commonProblems
What type of monitoring are you doing on your cluster? You could monitor
at both the host level and the Java (JMX) level. That will give you some
insight on where to look;
Patrick,
Thanks enormously.
This hasn't helped yet, but that is just because it was a very large bite of
the apple. Once I digest it, I can tell that it will be very helpful.
I did have a chance to look at the stat output and maximum latency was
300ms. How that connects with what you are
Well, that's good: a 300ms max latency means the server is turning
requests around fairly quickly. That would lead me to look at the client
VMs or (intermittent) network problems...
Keep in mind, though, that that's only one of your servers (unless you are
saying you checked all X of the servers in the
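Since the stat output only describes the server it was taken from, the same check can be repeated against every server in the ensemble. Here is a minimal Python sketch, assuming each server answers the `stat` four-letter command on its client port (2181 is assumed here) and reports a `Latency min/avg/max:` line. The function names and any host names passed in are made up for illustration.

```python
import re
import socket

# Matches the latency line of the `stat` reply,
# e.g. "Latency min/avg/max: 0/12/300" (values in milliseconds).
LATENCY_RE = re.compile(
    r"^Latency min/avg/max:\s*(\d+)/([\d.]+)/(\d+)", re.MULTILINE
)


def parse_latency(stat_text):
    """Return (min, avg, max) latency in ms from `stat` output, or None."""
    match = LATENCY_RE.search(stat_text)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), int(match.group(3))


def fetch_stat(host, port=2181, timeout=5.0):
    """Send the `stat` command to one server and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def max_latency_by_server(hosts, port=2181):
    """Map each host to its reported max latency (None if unavailable)."""
    result = {}
    for host in hosts:
        try:
            latency = parse_latency(fetch_stat(host, port))
        except OSError:
            latency = None  # an unreachable server is worth a look too
        result[host] = latency[2] if latency else None
    return result
```

Calling `max_latency_by_server([...])` with the ensemble's host names gives one number per server. A single server with a much higher maximum than the rest would point back at that host rather than at the clients or the network.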
Hi Ted,
Fellow user coming from HBase. We were recently seeing lots of
SessionExpired events as well. Check out this mail thread:
http://markmail.org/search/?q=SessionExpired#query:SessionExpired+page:1+mid:gt4c2kn4n4f5s5kw+state:results
Perhaps this might have something to do with what you're
Very good pointer. Thanks.
Are you still having your problems?
On Tue, Apr 14, 2009 at 6:09 PM, Nitay nit...@gmail.com wrote:
Hi Ted,
Fellow user coming from HBase. We were recently seeing lots of
SessionExpired events as well. Check out this mail thread:
Yes, we are. We currently don't handle SessionExpired very well at all in
HBase. There are two things going on in parallel to fix it:
1) Reinitialize the ZooKeeper handler (and everything else that depends on
it) on the node in question when a SessionExpired event occurs.
2) Reduce the number of
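
To illustrate item 1 only, here is a rough Python sketch of the "reinitialize on SessionExpired" pattern. It is not HBase's actual fix and does not use a real ZooKeeper client: `connect` is a hypothetical factory that opens a new session, the `"EXPIRED"` state string is a stand-in for whatever the client library reports, and the listeners stand in for "everything else that depends on it".

```python
import threading


class ReconnectingHandle:
    """Holds a client handle and replaces it when its session expires.

    `connect` is a hypothetical zero-argument factory that opens a new
    session and returns a handle. Listeners are called with the new
    handle so dependents can re-create watches, ephemeral nodes, etc.
    """

    def __init__(self, connect):
        self._connect = connect
        self._lock = threading.Lock()
        self._listeners = []
        self._handle = connect()

    def add_reinit_listener(self, callback):
        self._listeners.append(callback)

    def get(self):
        with self._lock:
            return self._handle

    def on_session_event(self, state):
        """Call this from the client's event callback with the new state."""
        if state != "EXPIRED":  # assumed state name, see lead-in
            return  # a plain disconnect does not invalidate the session
        with self._lock:
            old = self._handle
            self._handle = self._connect()
            new = self._handle
        # Release the dead handle if it offers a close() method.
        close = getattr(old, "close", None)
        if callable(close):
            close()
        # Let dependents rebuild their state on the fresh session.
        for callback in list(self._listeners):
            callback(new)
```

The point of routing everything through `get()` is that no component keeps a long-lived reference to a handle whose session may since have expired.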