Liu Shaohui created ZOOKEEPER-2092:
--------------------------------------

             Summary: A zk instance can not be connected for ZooKeeperServer is 
not running
                 Key: ZOOKEEPER-2092
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2092
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.4.4
            Reporter: Liu Shaohui


In our 5-node ZooKeeper cluster, clients can never connect to one of the 
nodes. The thread dump shows ZooKeeperServer blocked in submitRequest, waiting 
for the server to enter the running state. However, the node itself is running 
normally and is synced with the leader.

{code}
$ ./zkCli.sh -server 10.101.10.67:11000 ls /
2014-11-27 20:57:11,843 [myid:] - WARN  [main-SendThread(lg-com-master02.bj:11000):ClientCnxn$SendThread@1089] - Session 0x0 for server lg-com-master02.bj/10.101.10.67:11000, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:353)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1469)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1497)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:726)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:594)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:355)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:283)
{code}

ZooKeeperServer thread dump:
{code}
"NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11000" daemon prio=10 tid=0x00007f60143f7800 nid=0x31fd in Object.wait() [0x00007f5fd4678000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.zookeeper.server.ZooKeeperServer.submitRequest(ZooKeeperServer.java:634)
        - locked <0x00000007602756a0> (a org.apache.zookeeper.server.quorum.FollowerZooKeeperServer)
        at org.apache.zookeeper.server.ZooKeeperServer.submitRequest(ZooKeeperServer.java:626)
        at org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:525)
        at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:841)
        at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:410)
        at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:200)
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:236)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:662)
{code}
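
To illustrate the state shown in the thread dump above, here is a minimal, self-contained sketch of a "wait until running" gate. It is not the ZooKeeper source: the class RunningGateSketch and its methods markRunning and submittedCount are hypothetical names made up for this example, and the 1000 ms timeout is an assumption. It only shows how a thread that calls submitRequest while the running flag is still false sits in a timed Object.wait() on the server object, which may be what is happening to the NIOServerCxn.Factory thread.

```java
public class RunningGateSketch {
    // Hypothetical stand-in for the server's "running" flag.
    private boolean running = false;
    private int submitted = 0;

    // Blocks the caller in a timed wait until markRunning() is called.
    public synchronized void submitRequest(String request) throws InterruptedException {
        while (!running) {
            wait(1000); // wakes up periodically, re-checks the flag, waits again
        }
        submitted++;
    }

    // Flips the flag and wakes every thread parked in submitRequest().
    public synchronized void markRunning() {
        running = true;
        notifyAll();
    }

    public synchronized int submittedCount() {
        return submitted;
    }

    public static void main(String[] args) throws Exception {
        RunningGateSketch server = new RunningGateSketch();

        // Plays the role of the I/O thread handling a connect request.
        Thread io = new Thread(() -> {
            try {
                server.submitRequest("createSession");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "io-thread");
        io.setDaemon(true);
        io.start();

        // Give the I/O thread time to park; the flag is still false.
        io.join(300);
        System.out.println("still blocked: " + io.isAlive() + ", state: " + io.getState());

        // Once the flag is set, the parked thread proceeds.
        server.markRunning();
        io.join(2000);
        System.out.println("still blocked: " + io.isAlive()
                + ", submitted: " + server.submittedCount());
    }
}
```

If nothing ever sets the flag, the waiting thread loops forever, and any work that depends on that thread (here, accepting new client sessions) would stall while the rest of the process looks healthy.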

Any suggestions about this problem? Thanks.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
