Does anyone have any ideas?

Regards,
Qian Zhang

On Sun, May 19, 2019 at 6:15 PM Qian Zhang <[email protected]> wrote:

> Hi,
>
> I have a ZooKeeper cluster with 5 nodes. Today the leader became
> unreachable due to a hardware issue, and I then found that the 4 followers
> had simply shut down. Here are the logs:
>
>> May 18 15:34:28 MD001076 java[29148]: [myid:1] WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
>> java.net.SocketTimeoutException: Read timed out
>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>>     at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
>>     at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:937)
>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /10.249.255.10:42306
>> May 18 15:34:28 MD001076 java[29148]: [myid:1] WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@896] - Connection request from old client /10.249.255.10:42306; will be dropped if server is in r-o mode
>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@942] - Client attempting to establish new session at /10.249.255.10:42306
>> May 18 15:34:28 MD001076 java[29148]: [myid:1] ERROR [FollowerRequestProcessor:1:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : FollowerRequestProcessor:1
>> java.net.SocketException: Socket closed
>>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
>>     at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
>>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>>     at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:139)
>>     at org.apache.zookeeper.server.quorum.Learner.request(Learner.java:188)
>>     at org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:90)
>> May 18 15:34:28 MD001076 java[29148]: [myid:1] INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
>> java.lang.Exception: shutdown Follower
>>     at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
>>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:941)
>
> I am confused about why all the followers shut down in this case, which
> made the whole ZooKeeper cluster unusable for a short period. Shouldn't
> they elect a new leader instead? Thanks!
>
> Regards,
> Qian Zhang
