Not really. It happens only occasionally, every few days :( I believe it is connected to our network environment, which suffers from some packet loss that leads to connection timeouts.
I can switch on more logging if you can tell me which categories are worth enabling DEBUG for.

Regards, Łukasz Osipiuk

On Tue, Mar 16, 2010 at 16:35, Benjamin Reed <br...@yahoo-inc.com> wrote:
> weird, this does sound like a bug. do you have a reliable way of reproducing
> the problem?
>
> thanx
> ben
>
> On 03/16/2010 08:27 AM, Łukasz Osipiuk wrote:
>>
>> nope.
>>
>> I always pass 0 as clientid.
>>
>> Łukasz
>>
>> On Tue, Mar 16, 2010 at 16:20, Benjamin Reed<br...@yahoo-inc.com> wrote:
>>
>>>
>>> do you ever use zookeeper_init() with the clientid field set to something
>>> other than null?
>>>
>>> ben
>>>
>>> On 03/16/2010 07:43 AM, Łukasz Osipiuk wrote:
>>>
>>>>
>>>> Hi everyone!
>>>>
>>>> I am writing to this group because recently we are getting some
>>>> strange errors with our production zookeeper setup.
>>>>
>>>> From time to time we are observing that our client application (C++
>>>> based) disconnects from zookeeper (session state is changed to 1) and
>>>> reconnects (state changed to 3).
>>>> This itself is not a problem - usually application continues to run
>>>> without problems after reconnect.
>>>> But from time to time after above happens all subsequent operations
>>>> start to return ZSESSIONMOVED error. To make it work again we have to
>>>> restart application (which creates new zookeeper session).
>>>>
>>>> I noticed that 3.2.0 introduced a bug
>>>> http://issues.apache.org/jira/browse/ZOOKEEPER-449 but we are using
>>>> zookeeper v. 3.2.2.
>>>> I just noticed that app at compile time used 3.2.0 library but patches
>>>> fixing bug 449 did not touch C client lib so I believe that our
>>>> problems are not related to that.
>>>>
>>>> In zookeeper logs at moment which initiated the problem with client
>>>> application I have
>>>>
>>>> node1:
>>>> 2010-03-16 14:21:43,510 - INFO [NIOServerCxn.Factory:2181:nioserverc...@607] - Connected to /10.1.112.61:37197 lastZxid 42992576502
>>>> 2010-03-16 14:21:43,510 - INFO [NIOServerCxn.Factory:2181:nioserverc...@636] - Renewing session 0x324dcc1ba580085
>>>> 2010-03-16 14:21:49,443 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:nioserverc...@992] - Finished init of 0x324dcc1ba580085 valid:true
>>>> 2010-03-16 14:21:49,443 - WARN [NIOServerCxn.Factory:2181:nioserverc...@518] - Exception causing close of session 0x324dcc1ba580085 due to java.io.IOException: Read error
>>>> 2010-03-16 14:21:49,444 - INFO [NIOServerCxn.Factory:2181:nioserverc...@857] - closing session:0x324dcc1ba580085 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.1.112.62:2181 remote=/10.1.112.61:37197]
>>>>
>>>> node2:
>>>> 2010-03-16 14:21:40,580 - WARN [NIOServerCxn.Factory:2181:nioserverc...@494] - Exception causing close of session 0x324dcc1ba580085 due to java.io.IOException: Read error
>>>> 2010-03-16 14:21:40,581 - INFO [NIOServerCxn.Factory:2181:nioserverc...@833] - closing session:0x324dcc1ba580085 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.1.112.63:2181 remote=/10.1.112.61:60693]
>>>> 2010-03-16 14:21:46,839 - INFO [NIOServerCxn.Factory:2181:nioserverc...@583] - Connected to /10.1.112.61:48336 lastZxid 42992576502
>>>> 2010-03-16 14:21:46,839 - INFO [NIOServerCxn.Factory:2181:nioserverc...@612] - Renewing session 0x324dcc1ba580085
>>>> 2010-03-16 14:21:49,439 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:nioserverc...@964] - Finished init of 0x324dcc1ba580085 valid:true
>>>>
>>>> node3:
>>>> 2010-03-16 02:14:48,961 - WARN [NIOServerCxn.Factory:2181:nioserverc...@494] - Exception causing close of session 0x324dcc1ba580085 due to java.io.IOException: Read error
>>>> 2010-03-16 02:14:48,962 - INFO [NIOServerCxn.Factory:2181:nioserverc...@833] - closing session:0x324dcc1ba580085 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.1.112.64:2181 remote=/10.1.112.61:57309]
>>>>
>>>> and then lots of entries like this
>>>> 2010-03-16 02:14:54,696 - WARN [ProcessThread:-1:preprequestproces...@402] - Got exception when processing sessionid:0x324dcc1ba580085 type:create cxid:0x4b9e9e49 zxid:0xfffffffffffffffe txntype:unknown /locks/9871253/lock-8589943989-
>>>> org.apache.zookeeper.KeeperException$SessionMovedException: KeeperErrorCode = Session moved
>>>>     at org.apache.zookeeper.server.SessionTrackerImpl.checkSession(SessionTrackerImpl.java:231)
>>>>     at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:211)
>>>>     at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>>>> 2010-03-16 14:22:06,428 - WARN [ProcessThread:-1:preprequestproces...@402] - Got exception when processing sessionid:0x324dcc1ba580085 type:create cxid:0x4b9f6603 zxid:0xfffffffffffffffe txntype:unknown /locks/1665960/lock-8589961006-
>>>> org.apache.zookeeper.KeeperException$SessionMovedException: KeeperErrorCode = Session moved
>>>>     at org.apache.zookeeper.server.SessionTrackerImpl.checkSession(SessionTrackerImpl.java:231)
>>>>     at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:211)
>>>>     at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>>>>
>>>>
>>>> To work around disconnections I am going to increase session timeout
>>>> from 5 to 15 seconds but even if it helps at all it is just a
>>>> workaround.
>>>>
>>>> Do you have any idea what the source of my problem is?
>>>>
>>>> Regards, Łukasz Osipiuk
>>>>
>>>
>>
>
--
Łukasz Osipiuk
mailto:luk...@osipiuk.net
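
For readers following the thread: the state values 1 and 3 in the original report appear to correspond to ZOO_CONNECTING_STATE and ZOO_CONNECTED_STATE in the ZooKeeper C client. The sketch below is only an illustration of how a client could name those states in its logs and detect a run of ZSESSIONMOVED results, so that the manual restart described above could be replaced by re-creating the session. Everything in the `zk_sketch` namespace is hypothetical and not part of the ZooKeeper API. The numeric values are copied from my reading of `zookeeper.h` so that the sketch compiles without the library, and should be checked against the installed header. It does not address the underlying bug.

```cpp
#include <string>

namespace zk_sketch {

// Assumed to mirror zookeeper.h; verify against your installed header.
constexpr int kConnectingState     = 1;     // ZOO_CONNECTING_STATE
constexpr int kAssociatingState    = 2;     // ZOO_ASSOCIATING_STATE
constexpr int kConnectedState      = 3;     // ZOO_CONNECTED_STATE
constexpr int kExpiredSessionState = -112;  // ZOO_EXPIRED_SESSION_STATE
constexpr int kSessionMoved        = -118;  // ZSESSIONMOVED

// Turn a session state code into a readable name for log output.
inline std::string state_name(int state) {
    switch (state) {
        case kConnectingState:     return "CONNECTING";
        case kAssociatingState:    return "ASSOCIATING";
        case kConnectedState:      return "CONNECTED";
        case kExpiredSessionState: return "EXPIRED_SESSION";
        default: return "UNKNOWN(" + std::to_string(state) + ")";
    }
}

// Hypothetical helper: counts consecutive ZSESSIONMOVED return codes.
class SessionMovedGuard {
public:
    explicit SessionMovedGuard(int threshold) : threshold_(threshold) {}

    // Feed the return code of every operation. Returns true once
    // `threshold` consecutive operations have failed with ZSESSIONMOVED;
    // any other return code resets the count.
    bool record(int rc) {
        if (rc == kSessionMoved) {
            ++consecutive_;
        } else {
            consecutive_ = 0;
        }
        return consecutive_ >= threshold_;
    }

    // Call after a new session has been created.
    void reset() { consecutive_ = 0; }

private:
    int threshold_;
    int consecutive_ = 0;
};

}  // namespace zk_sketch
```

When `record()` returns true, the application could call `zookeeper_close()` on the old handle and `zookeeper_init()` again with a null clientid to start a fresh session, which is what the restart achieves today. The threshold value is a guess to be tuned; a single ZSESSIONMOVED may be transient.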