I will attach the logs inline this time. I looked at the metrics and can't find any issue wrt network connectivity or cpu/disk/memory issues. We also moved the data directory to /run memfs to avoid fsync latencies as we don't need persistency. Could you pls. take a look at the logs and see if you can spot something that gives an insight to the underlying issue.
Thanks, Anand. *From node1:* 2017-09-21 02:57:13,164 [myid:1] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started. Removing file: Sep 21, 2017 1:17:24 AM /run/zookeeper/version-2/log.50006f026 Removing file: Sep 21, 2017 12:47:17 AM /run/zookeeper/version-2/log.50006ecf6 Removing file: Sep 21, 2017 1:17:24 AM /run/zookeeper/version-2/snapshot.50006f375 Removing file: Sep 21, 2017 12:47:17 AM /run/zookeeper/version-2/snapshot.50006f024 2017-09-21 02:57:13,164 [myid:1] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed. 2017-09-21 02:59:32,298 [myid:1] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x15e5304d29b0046 2017-09-21 02:59:34,271 [myid:1] - WARN [QuorumPeer[myid=1]/127.0.0.1:5002 :Follower@89] - Exception when following the leader java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:153) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153) at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786) 2017-09-21 02:59:34,271 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :Follower@166] - shutdown called java.lang.Exception: shutdown Follower at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790) 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :FollowerZooKeeperServer@139] - Shutting down 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :ZooKeeperServer@441] - shutting down 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :FollowerRequestProcessor@105] - Shutting down 2017-09-21 02:59:34,272 [myid:1] - INFO [FollowerRequestProcessor:1:FollowerRequestProcessor@95] - FollowerRequestProcessor exited loop! 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :CommitProcessor@181] - Shutting down 2017-09-21 02:59:34,272 [myid:1] - INFO [CommitProcessor:1:CommitProcessor@150] - CommitProcessor exited loop! 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :FinalRequestProcessor@415] - shutdown of request processor complete 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :SyncRequestProcessor@209] - Shutting down 2017-09-21 02:59:34,272 [myid:1] - INFO [SyncThread:1:SyncRequestProcessor@187] - SyncRequestProcessor exited! 2017-09-21 02:59:34,272 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :QuorumPeer@714] - LOOKING 2017-09-21 02:59:34,273 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fba3 2017-09-21 02:59:34,275 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 2017-09-21 02:59:34,275 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 2017-09-21 02:59:34,277 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :FastLeaderElection@815] - New election. My id = 1, proposed zxid=0x50006fec3 2017-09-21 02:59:34,277 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 02:59:34,278 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 2 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 02:59:34,278 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 02:59:34,478 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :QuorumPeer@784] - FOLLOWING 2017-09-21 02:59:34,478 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir /run/zookeeper/version-2 2017-09-21 02:59:34,478 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 205 2017-09-21 02:59:34,481 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :Learner@323] - Getting a diff from the leader 0x50006fec3 2017-09-21 02:59:34,520 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :FileTxnSnapLog@240] - Snapshotting: 0x50006fec3 to /run/zookeeper/version-2/snapshot.50006fec3 2017-09-21 02:59:34,876 [myid:1] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x15e5d10a4b50020 2017-09-21 02:59:34,944 [myid:1] - WARN [QuorumPeer[myid=1]/127.0.0.1:5002 :Follower@118] - Got zxid 0x600000001 expected 0x1 2017-09-21 02:59:35,495 [myid:1] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x15e901612194a6f 2017-09-21 03:00:02,649 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING (n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state) 2017-09-21 03:00:02,650 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING (n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state) 2017-09-21 03:00:03,651 [myid:1] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x15e5304d29b0046 2017-09-21 03:00:03,651 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:5002 :ZooKeeperServer@610] - Invalid session 0x15e5304d29b0046 for client / 127.0.0.1:57028, probably expired 2017-09-21 03:00:03,660 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING (n.state), 3 (n.sid), 0x5 (n.peerEpoch) FOLLOWING (my state) 2017-09-21 03:00:04,748 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd1000e, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:11,624 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd10001, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:11,847 [myid:1] - INFO [SyncThread:1:FileTxnLog@199] - Creating new log file: log.6000003ac 2017-09-21 03:00:11,847 [myid:1] - INFO [Snapshot Thread:FileTxnSnapLog@240] - Snapshotting: 0x6000003ab to /run/zookeeper/version-2/snapshot.6000003ab 2017-09-21 03:00:34,482 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd1006a, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:34,536 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd1006c, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:34,645 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd1006e, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:36,104 [myid:1] - INFO [Snapshot Thread:FileTxnSnapLog@240] - Snapshotting: 0x60000072a to /run/zookeeper/version-2/snapshot.60000072a 2017-09-21 03:00:36,138 [myid:1] - INFO [SyncThread:1:FileTxnLog@199] - Creating new log file: log.60000072c 2017-09-21 03:00:36,558 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd1007c, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:37,770 [myid:1] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15ea2604fd10089, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) *From node2:* 2017-09-21 02:56:34,210 [myid:2] - INFO [Snapshot Thread:FileTxnSnapLog@240] - Snapshotting: 0x50006fe73 to /run/zookeeper/version-2/snapshot.50006fe73 2017-09-21 02:56:34,212 [myid:2] - INFO [SyncThread:2:FileTxnLog@199] - Creating new log file: log.50006fe75 2017-09-21 02:57:15,736 [myid:2] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started. Removing file: Sep 21, 2017 1:39:07 AM /run/zookeeper/version-2/log.50006f22d Removing file: Sep 21, 2017 1:05:52 AM /run/zookeeper/version-2/log.50006f01d Removing file: Sep 21, 2017 12:47:06 AM /run/zookeeper/version-2/log.50006ec55 Removing file: Sep 21, 2017 1:39:07 AM /run/zookeeper/version-2/snapshot.50006f5df Removing file: Sep 21, 2017 1:05:52 AM /run/zookeeper/version-2/snapshot.50006f22d Removing file: Sep 21, 2017 12:47:06 AM /run/zookeeper/version-2/snapshot.50006f01b 2017-09-21 02:57:15,737 [myid:2] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed. 2017-09-21 02:59:31,513 [myid:2] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x15e5d10a4b50020 2017-09-21 02:59:31,748 [myid:2] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x25e8d4776a2557c, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 02:59:32,350 [myid:2] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x25e8d4776a2557c 2017-09-21 02:59:33,221 [myid:2] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x25e5304d29d003c 2017-09-21 02:59:33,393 [myid:2] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:Learner@108] - Revalidating client: 0x25e5304d29d003b 2017-09-21 02:59:34,272 [myid:2] - WARN [QuorumPeer[myid=2]/127.0.0.1:5002 :Follower@89] - Exception when following the leader java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:153) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153) at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786) 2017-09-21 02:59:34,272 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :Follower@166] - shutdown called java.lang.Exception: shutdown Follower at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790) 2017-09-21 02:59:34,273 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :FollowerZooKeeperServer@139] - Shutting down 2017-09-21 02:59:34,273 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :ZooKeeperServer@441] - shutting down 2017-09-21 02:59:34,273 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :FollowerRequestProcessor@105] - Shutting down 2017-09-21 02:59:34,273 [myid:2] - INFO [FollowerRequestProcessor:2:FollowerRequestProcessor@95] - FollowerRequestProcessor exited loop! 2017-09-21 02:59:34,273 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :CommitProcessor@181] - Shutting down 2017-09-21 02:59:34,273 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :FinalRequestProcessor@415] - shutdown of request processor complete 2017-09-21 02:59:34,274 [myid:2] - INFO [CommitProcessor:2:CommitProcessor@150] - CommitProcessor exited loop! 2017-09-21 02:59:34,274 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :SyncRequestProcessor@209] - Shutting down 2017-09-21 02:59:34,274 [myid:2] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@841] - Refusing session request for client /127.0.0.1:59500 as it has seen zxid 0x50006febd our last zxid is 0x0 client must try another server 2017-09-21 02:59:34,274 [myid:2] - INFO [SyncThread:2:SyncRequestProcessor@187] - SyncRequestProcessor exited! 2017-09-21 02:59:34,274 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :QuorumPeer@714] - LOOKING 2017-09-21 02:59:34,274 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fe73 2017-09-21 02:59:34,276 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :FastLeaderElection@815] - New election. My id = 2, proposed zxid=0x50006fec3 2017-09-21 02:59:34,277 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 2 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 02:59:34,277 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 02:59:34,277 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 02:59:34,404 [myid:2] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 2017-09-21 02:59:34,478 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :QuorumPeer@796] - LEADING 2017-09-21 02:59:34,478 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir /run/zookeeper/version-2 2017-09-21 02:59:34,478 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :Leader@358] - LEADING - LEADER ELECTION TOOK - 204 2017-09-21 02:59:34,479 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39016 :LearnerHandler@330] - Follower sid: 1 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@988c127 2017-09-21 02:59:34,479 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39016 :LearnerHandler@385] - Synchronizing with Follower sid: 1 maxCommittedLog=0x50006fec3 minCommittedLog=0x50006fe74 peerLastZxid=0x50006fec3 2017-09-21 02:59:34,480 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39016 :LearnerHandler@462] - Sending DIFF 2017-09-21 02:59:34,520 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39016 :LearnerHandler@522] - Received NEWLEADER-ACK message from 1 2017-09-21 02:59:34,520 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:5002 :Leader@943] - Have quorum of supporters, sids: [ 1,2 ]; starting up and setting last processed zxid: 0x600000000 2017-09-21 02:59:40,000 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x35e68954661a833, timeout of 5000ms exceeded 2017-09-21 02:59:40,000 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x25e8d4776a2557c, timeout of 5000ms exceeded 2017-09-21 02:59:56,001 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e5304d29b0046, timeout of 20000ms exceeded 2017-09-21 02:59:56,001 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x25e5304d29d003b, timeout of 20000ms exceeded 2017-09-21 02:59:56,001 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x25e5304d29d003c, timeout of 20000ms exceeded 2017-09-21 02:59:56,001 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e5d10a4b50046, timeout of 20000ms exceeded 2017-09-21 03:00:02,649 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x40000e237 (n.zxid), 0xc (n.round), LEADING (n.state), 3 (n.sid), 0x5 (n.peerEpoch) LEADING (my state) 2017-09-21 03:00:03,659 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING (n.state), 3 (n.sid), 0x5 (n.peerEpoch) LEADING (my state) 2017-09-21 03:00:03,698 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39060 :LearnerHandler@330] - Follower sid: 3 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3753377a 2017-09-21 03:00:03,699 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39060 :LearnerHandler@385] - Synchronizing with Follower sid: 3 maxCommittedLog=0x600000086 minCommittedLog=0x50006fe74 peerLastZxid=0x50006fed5 2017-09-21 03:00:03,699 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39060 :LearnerHandler@462] - Sending TRUNC 2017-09-21 03:00:03,709 [myid:2] - INFO [LearnerHandler-/127.0.0.1:39060 :LearnerHandler@522] - Received NEWLEADER-ACK message from 3 2017-09-21 03:00:03,978 [myid:2] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x25ea2604ff80006, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:07,354 [myid:2] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x25ea2604ff80025, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:08,928 [myid:2] - INFO [Snapshot Thread:FileTxnSnapLog@240] - Snapshotting: 0x60000026c to /run/zookeeper/version-2/snapshot.60000026c 2017-09-21 03:00:08,929 [myid:2] - INFO [SyncThread:2:FileTxnLog@199] - Creating new log file: log.60000026e 2017-09-21 03:00:22,000 [myid:2] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x35ea260c1fb002f, timeout of 10000ms exceeded 2017-09-21 03:00:23,900 [myid:2] - INFO [Snapshot Thread:FileTxnSnapLog@240] - Snapshotting: 0x600000489 to /run/zookeeper/version-2/snapshot.600000489 2017-09-21 03:00:23,914 [myid:2] - INFO [SyncThread:2:FileTxnLog@199] - Creating new log file: log.60000048b *From node3:* 2017-09-21 02:52:22,913 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35e68954661a822, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 02:55:23,499 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35e68954661a836, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 02:57:17,081 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started. Removing file: Sep 21, 2017 1:34:26 AM /run/zookeeper/version-2/log.50006f20a Removing file: Sep 21, 2017 1:04:15 AM /run/zookeeper/version-2/log.50006effd Removing file: Sep 21, 2017 12:46:00 AM /run/zookeeper/version-2/log.50006ec6c Removing file: Sep 21, 2017 1:34:26 AM /run/zookeeper/version-2/snapshot.50006f55b Removing file: Sep 21, 2017 1:04:15 AM /run/zookeeper/version-2/snapshot.50006f208 Removing file: Sep 21, 2017 12:46:00 AM /run/zookeeper/version-2/snapshot.50006effb 2017-09-21 02:57:17,082 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed. 2017-09-21 02:59:24,260 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35e68954661a852, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,644 [myid:3] - ERROR [LearnerHandler-/127.0.0.1:40958 :LearnerHandler@633] - Unexpected exception causing shutdown while sock still open java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:546) 2017-09-21 03:00:02,645 [myid:3] - WARN [LearnerHandler-/127.0.0.1:40958 :LearnerHandler@646] - ******* GOODBYE /127.0.0.1:40958 ******** 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e901612194a6f, timeout of 5000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x25e8d4776a2557c, timeout of 5000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x35e68954661a833, timeout of 5000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x25e5304d29d003c, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e5d10a4b50020, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x25e5304d29d003b, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e5d10a4b50046, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e5d10a4b50022, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x35e5d10a4de0029, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x15e5d10a4b50021, timeout of 20000ms exceeded 2017-09-21 03:00:02,645 [myid:3] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x35e5d10a4de002a, timeout of 20000ms exceeded 2017-09-21 03:00:02,648 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 2 (n.sid), 0x5 (n.peerEpoch) LEADING (my state) 2017-09-21 03:00:02,648 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LEADING (my state) 2017-09-21 03:00:02,648 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LOOKING (n.state), 1 (n.sid), 0x5 (n.peerEpoch) LEADING (my state) 2017-09-21 03:00:02,649 [myid:3] - ERROR [LearnerHandler-/127.0.0.1:58432 :LearnerHandler@633] - Unexpected exception causing shutdown while sock still open java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:546) 2017-09-21 03:00:02,649 [myid:3] - WARN [LearnerHandler-/127.0.0.1:58432 :LearnerHandler@646] - ******* GOODBYE /127.0.0.1:58432 ******** 2017-09-21 03:00:02,651 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x15e5304d29b0046 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:null Error:KeeperErrorCode = Session moved 2017-09-21 03:00:02,651 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15e5d10a4b50020, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,651 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x25e5304d29d003b, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,651 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35e68954661a833, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,652 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x25e5304d29d003c, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,652 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15e5304d29b0046, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,652 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35e5d10a4de0029, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,653 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15e5d10a4b50022, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,653 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35e5d10a4de002a, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,653 [myid:3] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] - Invalid session 0x35e68954661a833 for client /127.0.0.1:46968, probably expired 2017-09-21 03:00:02,653 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15e5d10a4b50021, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:02,654 [myid:3] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] - Invalid session 0x25e5304d29d003c for client /127.0.0.1:46982, probably expired 2017-09-21 03:00:02,654 [myid:3] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] - Invalid session 0x15e5d10a4b50046 for client /127.0.0.1:46984, probably expired 2017-09-21 03:00:02,660 [myid:3] - INFO [NIOServerCxn.Factory:localhost/127.0.0.1:5002:ZooKeeperServer@610] - Invalid session 0x25e5304d29d003b for client /127.0.0.1:46994, probably expired 2017-09-21 03:00:03,648 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :Leader@493] - Shutting down 2017-09-21 03:00:03,648 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :Leader@499] - Shutdown called java.lang.Exception: shutdown Leader! reason: Not sufficient followers synced, only synced with sids: [ 3 ] at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:499) at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:799) 2017-09-21 03:00:03,648 [myid:3] - INFO [Thread-17:Leader$LearnerCnxAcceptor@322] - exception while shutting down acceptor: java.net.SocketException: Socket closed 2017-09-21 03:00:03,648 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :ZooKeeperServer@441] - shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :SessionTrackerImpl@225] - Shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :PrepRequestProcessor@761] - Shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :ProposalRequestProcessor@88] - Shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :CommitProcessor@181] - Shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :Leader$ToBeAppliedRequestProcessor@658] - Shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :FinalRequestProcessor@415] - shutdown of request processor complete 2017-09-21 03:00:03,649 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :SyncRequestProcessor@209] - Shutting down 2017-09-21 03:00:03,649 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop! 2017-09-21 03:00:03,649 [myid:3] - INFO [CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited loop! 2017-09-21 03:00:03,649 [myid:3] - INFO [SyncThread:3:SyncRequestProcessor@187] - SyncRequestProcessor exited! 2017-09-21 03:00:03,652 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :QuorumPeer@714] - LOOKING 2017-09-21 03:00:03,653 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fcf2 2017-09-21 03:00:03,659 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :FastLeaderElection@815] - New election. My id = 3, proposed zxid=0x50006fed5 2017-09-21 03:00:03,659 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 3 (n.leader), 0x50006fed5 (n.zxid), 0xd (n.round), LOOKING (n.state), 3 (n.sid), 0x5 (n.peerEpoch) LOOKING (my state) 2017-09-21 03:00:03,660 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), LEADING (n.state), 2 (n.sid), 0x6 (n.peerEpoch) LOOKING (my state) 2017-09-21 03:00:03,697 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x50006fec3 (n.zxid), 0xd (n.round), FOLLOWING (n.state), 1 (n.sid), 0x6 (n.peerEpoch) LOOKING (my state) 2017-09-21 03:00:03,697 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :QuorumPeer@784] - FOLLOWING 2017-09-21 03:00:03,697 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /run/zookeeper/version-2 snapdir /run/zookeeper/version-2 2017-09-21 03:00:03,697 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 44 2017-09-21 03:00:03,700 [myid:3] - WARN [QuorumPeer[myid=3]/127.0.0.1:5002 :Learner@338] - Truncating log to get in sync with the leader 0x50006fec3 2017-09-21 03:00:03,702 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :FileSnap@83] - Reading snapshot /run/zookeeper/version-2/snapshot.50006fcf2 2017-09-21 03:00:03,707 [myid:3] - WARN [QuorumPeer[myid=3]/127.0.0.1:5002 :Learner@374] - Got zxid 0x600000001 expected 0x1 2017-09-21 03:00:03,708 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1:5002 :FileTxnSnapLog@240] - Snapshotting: 0x600000086 to /run/zookeeper/version-2/snapshot.600000086 2017-09-21 03:00:03,773 [myid:3] - INFO [SyncThread:3:FileTxnLog@199] - Creating new log file: log.600000087 2017-09-21 03:00:03,776 [myid:3] - WARN [QuorumPeer[myid=3]/127.0.0.1:5002 :Follower@118] - Got zxid 0x600000088 expected 0x1 2017-09-21 03:00:03,895 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35ea260c1fb0002, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:03,978 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@362] - Exception causing close of session 0x35ea260c1fb0003 due to java.io.IOException: Connection reset by peer 2017-09-21 03:00:04,000 [myid:3] - INFO [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop! 2017-09-21 03:00:05,806 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35ea260c1fb0011, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) 2017-09-21 03:00:09,600 [myid:3] - INFO [Snapshot Thread:FileTxnSnapLog@240] - Snapshotting: 0x60000029c to /run/zookeeper/version-2/snapshot.60000029c 2017-09-21 03:00:09,606 [myid:3] - INFO [SyncThread:3:FileTxnLog@199] - Creating new log file: log.60000029e 2017-09-21 03:00:10,399 [myid:3] - WARN [NIOServerCxn.Factory:localhost/127.0.0.1:5002:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x35ea260c1fb002f, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:745) Thanks, Anand. On Thu, Oct 5, 2017 at 9:50 AM, Patrick Hunt <ph...@apache.org> wrote: > Unfortunately I don't see any attached logs, which makes it difficult to > provide you with insight. "Not sufficient followers synced" indicates that > you're losing followers, likely they are falling behind - what is your > metric tracking telling your wrt load on the compute and load on the > disk/memory/network/etc... also metrics at the ZK level (e.g. are zk > latencies increasing?) Check the logs to see if you're seeing "fsync" > slowness issues (it's a warning in the server logs). This is a pretty > common issue. GC might also be an issue, although that's more rare these > days (hard to say w/o knowing your use case, etc...) Again, look to your > metrics collection for insight where to start. > > Patrick > > On Wed, Oct 4, 2017 at 11:17 AM, Anand Parthasarathy < > anpar...@avinetworks.com> wrote: > > > Hi, > > > > We have an issue with a 3-node zookeeper ensemble where the quorum goes > > down due to no apparent reason every once in a while. Here is what I see > in > > the ZK leader: > > > > 2017-09-21 03:00:03,648 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1: > > 5002:Leader@493] - Shutting down > > 2017-09-21 03:00:03,648 [myid:3] - INFO [QuorumPeer[myid=3]/127.0.0.1: > > 5002:Leader@499] - Shutdown called > > java.lang.Exception: shutdown Leader! reason: Not sufficient followers > > synced, only synced with sids: [ 3 ] > > at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader. > java:499) > > at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:474) > > at org.apache.zookeeper.server.quorum.QuorumPeer.run( > > QuorumPeer.java:799) > > > > I have attached the logs from the 3 nodes around this time. Could you > pls. > > help understand what the issue could be here. The only thing I see a > little > > bit ahead of this timestamp is that all of them did a PurgeTask pretty > much > > at the same time. > > > > Thanks, > > Anand. > > >