[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683181#action_12683181
 ] 

Patrick Hunt commented on ZOOKEEPER-344:
----------------------------------------

Hi Bryan, you might also try looking at some of the statistics using the "stat" 
command:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_zkCommands
This will give you insight into the min/max/avg latency of requests. You could 
also use JMX if that works for you:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperJMX.html
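
For illustration, here is a minimal sketch of polling "stat" from a script rather than by hand. It assumes the server replies with a line of the form "Latency min/avg/max: a/b/c"; check that against the output of your own server version. The helper names (four_letter_word, parse_latency) are made up for this example and are not part of any ZooKeeper client library.

```python
import re
import socket


def four_letter_word(host, port, command="stat", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return the reply text.

    The server writes its reply and then closes the connection, so we read
    until recv() returns an empty byte string.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_latency(stat_output):
    """Extract min/avg/max latency from "stat" output.

    ASSUMPTION: the latency line looks like "Latency min/avg/max: 0/12/350".
    Returns None when no such line is present.
    """
    match = re.search(r"Latency min/avg/max:\s*([\d.]+)/([\d.]+)/([\d.]+)",
                      stat_output)
    if match is None:
        return None
    low, avg, high = (float(value) for value in match.groups())
    return {"min": low, "avg": avg, "max": high}


# Example use against a local server on the clientPort from the report:
#   reply = four_letter_word("localhost", 2181, "stat")
#   print(parse_latency(reply))
```

Sampling this periodically during a benchmark run would show whether max latency climbs as load increases.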

What is the timeout value you are using for your ZK clients? If your max 
latency is exceeding your client
timeouts then you will definitely see expirations.
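
As a rough sketch of that comparison: the code below checks an observed max latency against the session timeout the client would actually get. It assumes the server clamps the requested timeout to between 2x and 20x tickTime, which is my reading of the admin docs and should be verified for the version in use. The function names are hypothetical.

```python
def effective_session_timeout(requested_ms, tick_time_ms):
    """Return the timeout the server would grant for a requested value.

    ASSUMPTION: the server bounds the session timeout to the range
    [2 * tickTime, 20 * tickTime]; verify this for your ZooKeeper version.
    """
    lowest = 2 * tick_time_ms
    highest = 20 * tick_time_ms
    return max(lowest, min(requested_ms, highest))


def expiration_risk(max_latency_ms, requested_timeout_ms, tick_time_ms):
    """True when the observed max latency reaches the granted session timeout."""
    granted = effective_session_timeout(requested_timeout_ms, tick_time_ms)
    return max_latency_ms >= granted


# With tickTime=2000 as in the reported configuration, a client asking for a
# 1000 ms timeout would be granted 4000 ms under the assumption above.
print(effective_session_timeout(1000, 2000))
print(expiration_risk(5000, 1000, 2000))
```

The max latency here would come from "stat" or JMX; the requested timeout is whatever value the client passes when it opens its session.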

Secondly, review this section, specifically the parts on transaction log 
placement and JDK memory (swapping) issues:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_commonProblems
Either of these issues can cause performance to dip, and latencies to increase.
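
On log placement: the configuration quoted below sets dataDir and dataLogDir to the same path. A small sketch of checking for that in a config file follows. It only compares the two paths and cannot tell whether they sit on a dedicated device. The helper names are hypothetical.

```python
import os


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def shares_log_dir(props):
    """True when transaction logs and snapshots resolve to the same directory.

    When dataLogDir is not set, the transaction log is written to dataDir.
    """
    data_dir = props.get("dataDir")
    if data_dir is None:
        return False
    log_dir = props.get("dataLogDir", data_dir)
    return os.path.normpath(data_dir) == os.path.normpath(log_dir)


# The configuration from the issue description:
config = """
clientPort=2181
dataDir=/var/bigdata/benchmark/zookeeper/1
syncLimit=2
dataLogDir=/var/bigdata/benchmark/zookeeper/1
tickTime=2000
"""
print(shares_log_dir(parse_properties(config)))
```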

This information, along with a bit more detail on your benchmark would help 
you/us identify what's causing
these issues. Re your benchmark, how many operations/sec are you running? 
What's the read/write split?

Your zk server is a single quad-core x86_64 cpu, correct?

> doIO in NioServerCnxn: Exception causing close of session : cause is "read 
> error"
> ---------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-344
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-344
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: java client, server
>    Affects Versions: 3.1.0
>         Environment: jdk1.6.0_07
> Linux blade2 2.6.27.7-134.fc10.x86_64 #1 SMP Mon Dec 1 22:21:35 EST 2008 
> x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: bryan thompson
>             Fix For: 3.2.0
>
>
> I have been having a problem with zookeeper 3.0.1 and now with 3.1.0 where I 
> see a lot of expired sessions.  I am using a 16 node cluster which is all on 
> the same local network.  There is a single zookeeper instance (these are 
> benchmarking runs).
> The problem appears to be correlated with either run time or system load.
> Personally I think that it is system load because I have session 
> expired events under a Windows platform running zookeeper and the application 
> (i.e., everything is local) when the application load suddenly spikes.  To me 
> this suggests that the client is not able to renew (ping) the zookeeper 
> service in a timely manner and is expired.  But the log messages below with 
> the "read error" suggest that maybe there is something else going on?
> Zookeeper Configuration
> #Wed Mar 18 12:41:05 GMT-05:00 2009
> clientPort=2181
> dataDir=/var/bigdata/benchmark/zookeeper/1
> syncLimit=2
> dataLogDir=/var/bigdata/benchmark/zookeeper/1
> tickTime=2000
> Some representative log messages are below.
> Client side messages (from our app)
> ERROR [main-EventThread] 
> com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 
> 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. 
> New state: Expired : 
> zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1160/locknode
> ERROR [main-EventThread] 
> com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 
> 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. 
> New state: Expired : 
> zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1356/locknode
> Server side messages:
>  WARN [NIOServerCxn.Factory:2181] 
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 
> 2009-03-18 13:06:57,252 - Exception causing close of session 
> 0x1201aac14300022 due to java.io.IOException: Read error
>  WARN [NIOServerCxn.Factory:2181] 
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 
> 2009-03-18 13:06:58,198 - Exception causing close of session 
> 0x1201aac1430000f due to java.io.IOException: Read error

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
