Hi,
I am using zookeeper 3.4.6
I have a spark cluster configured with HA. Once per 1-2 days, the active spark 
master is shutting down with a message
15/09/23 18:58:18 INFO zookeeper.ClientCnxn: Unable to read additional data 
from server sessionid 0x34ffa68dbd10021, likely server has closed socket, 
closing socket connection and attempting reconnect
15/09/23 18:58:18 INFO state.ConnectionStateManager: State change: SUSPENDED
15/09/23 18:58:18 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
leadership
15/09/23 18:58:18 ERROR master.Master: Leadership has been revoked -- master 
shutting down.
15/09/23 18:58:18 INFO util.Utils: Shutdown hook called

I don’t have the zookeeper logs from the same period, but the logs are full of 
the these messages 
2015-09-24 05:07:42,228 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.0.8.4:34705
2015-09-24 05:07:42,229 [myid:1] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection 
request from old client /10.0.8.4:34705; will be dropped if server is in r-o 
mode
2015-09-24 05:07:42,229 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client 
attempting to establish new session at /10.0.8.4:34705
2015-09-24 05:07:42,292 [myid:1] - INFO  
[CommitProcessor:1:ZooKeeperServer@617] - Established session 0x14ffd3670130030 
with negotiated timeout 20001 for client /10.0.8.4:34705
2015-09-24 05:07:42,302 [myid:1] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of 
stream exception
EndOfStreamException: Unable to read additional data from client sessionid 
0x14ffd3670130030, likely client has closed socket
        at 
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-09-24 05:07:42,303 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket 
connection for client /10.0.8.4:34705 which had sessionid 0x14ffd3670130030
2015-09-24 05:07:42,314 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@178] 
- Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at 
org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at 
org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
        at 
org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
        at 
org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
2015-09-24 05:07:42,334 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.0.8.4:34707
2015-09-24 05:07:42,334 [myid:1] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection 
request from old client /10.0.8.4:34707; will be dropped if server is in r-o 
mode
2015-09-24 05:07:42,335 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client 
attempting to establish new session at /10.0.8.4:34707
2015-09-24 05:07:42,357 [myid:1] - INFO  
[CommitProcessor:1:ZooKeeperServer@617] - Established session 0x14ffd3670130031 
with negotiated timeout 20001 for client /10.0.8.4:34707
2015-09-24 05:07:42,364 [myid:1] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of 
stream exception
EndOfStreamException: Unable to read additional data from client sessionid 
0x14ffd3670130031, likely client has closed socket
        at 
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-09-24 05:07:42,365 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket 
connection for client /10.0.8.4:34707 which had sessionid 0x14ffd3670130031
2015-09-24 05:07:42,376 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@178] 
- Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
        at 
org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
        at 
org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
        at 
org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
        at 
org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)

Also there are 
2015-09-24 06:29:54,459 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139] - 
Shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@441] - shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@105] - 
Shutting down
2015-09-24 06:29:54,459 [myid:1] - INFO  
[FollowerRequestProcessor:1:FollowerRequestProcessor@95] - 
FollowerRequestProcessor exited loop!
2015-09-24 06:29:54,460 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - Shutting down
2015-09-24 06:29:54,464 [myid:1] - INFO  
[CommitProcessor:1:CommitProcessor@150] - CommitProcessor exited loop!
2015-09-24 06:29:54,465 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] - shutdown 
of request processor complete
2015-09-24 06:29:54,466 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@209] - Shutting 
down
2015-09-24 06:29:54,466 [myid:1] - INFO  
[SyncThread:1:SyncRequestProcessor@187] - SyncRequestProcessor exited!
2015-09-24 06:29:54,466 [myid:1] - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@714] - LOOKING
2015-09-24 06:29:54,584 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.0.8.58:36137
2015-09-24 06:29:54,584 [myid:1] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2015-09-24 06:29:54,584 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket 
connection for client /10.0.8.58:36137 (no session established for client)
2015-09-24 06:29:54,679 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.0.8.57:57410
2015-09-24 06:29:54,680 [myid:1] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running
2015-09-24 06:29:54,680 [myid:1] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket 
connection for client /10.0.8.57:57410 (no session established for client)

I also observed that hadoop-zkfc restarts very frequently.
Any ideas what could be wrong?

Thanks.

Reply via email to