[ https://issues.apache.org/jira/browse/ZOOKEEPER-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058986#comment-17058986 ]
Dai Shi commented on ZOOKEEPER-3756:
------------------------------------

Ah, I forgot one very important file: [^zoo-service.yaml]

Actually, looking through this file I found out why the original issue was happening :facepalm: I had a copy/paste error and forgot to update the selectors for zoo-internal-3 and zoo-internal-4 (the peer and leader election ports) when adding the 2 new members, so they were both still pointing to zoo-2. After fixing this error, things now behave correctly, just as they do with a 3-member cluster. Sorry for wasting so much of your time!

However, there is still the outstanding issue where restarting the leader pod causes the cluster to be down for around 30 seconds while it restarts. Is this expected? I was going to spin up a ZooKeeper cluster outside of Kubernetes today just to confirm whether it shows the same behavior. This is a big issue for running in Kubernetes, because pods and nodes need to be restarted relatively frequently for Kubernetes upgrades, and our services cannot tolerate a 30 second ZooKeeper outage.

Since this is not the issue originally reported here, I'm happy to close this ticket and open a new one if needed.
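For reference, the copy/paste mistake was roughly of the following shape. This is only a sketch: the exact label keys, port names, and the zoo-internal-N to pod zoo-N mapping are assumptions here, and the real manifests are in the attached [^zoo-service.yaml].

{code:yaml}
# Per-member headless Service for the quorum (2888) and leader election (3888) ports.
# Illustrative only; the actual zoo-service.yaml may use different labels and port names.
apiVersion: v1
kind: Service
metadata:
  name: zoo-internal-3
spec:
  clusterIP: None
  selector:
    # Bug: the selector was copied from an existing member and still targeted the zoo-2 pod:
    #   statefulset.kubernetes.io/pod-name: zoo-2
    # Fix: select this member's own pod instead.
    statefulset.kubernetes.io/pod-name: zoo-3
  ports:
    - name: peer
      port: 2888
    - name: leader-election
      port: 3888
{code}

With both of the new Services still selecting the zoo-2 pod, peer and election traffic for the added members was silently routed to the wrong server, which is why the cluster only behaved correctly once the selectors were fixed.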
> Members failing to rejoin quorum
> --------------------------------
>
>                 Key: ZOOKEEPER-3756
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3756
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: leaderElection
>    Affects Versions: 3.5.6, 3.5.7
>            Reporter: Dai Shi
>            Assignee: Mate Szalay-Beko
>            Priority: Major
>         Attachments: Dockerfile, configmap.yaml, docker-entrypoint.sh, jmx.yaml, zoo-service.yaml, zookeeper.yaml
>
> Not sure if this is the place to ask, please close if it's not.
> I am seeing some behavior that I can't explain since upgrading to 3.5:
> In a 5 member quorum, when server 3 is the leader and each server has this in their configuration:
> {code:java}
> server.1=100.71.255.254:2888:3888:participant;2181
> server.2=100.71.255.253:2888:3888:participant;2181
> server.3=100.71.255.252:2888:3888:participant;2181
> server.4=100.71.255.251:2888:3888:participant;2181
> server.5=100.71.255.250:2888:3888:participant;2181{code}
> If servers 1 or 2 are restarted, they fail to rejoin the quorum with this in the logs:
> {code:java}
> 2020-03-11 20:23:35,720 [myid:2] - INFO  [QuorumPeer[myid=2](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1175] - LOOKING
> 2020-03-11 20:23:35,721 [myid:2] - INFO  [QuorumPeer[myid=2](plain=0.0.0.0:2181)(secure=disabled):FastLeaderElection@885] - New election. My id = 2, proposed zxid=0x1b8005f4bba
> 2020-03-11 20:23:35,733 [myid:2] - INFO  [WorkerSender[myid=2]:QuorumCnxManager@438] - Have smaller server identifier, so dropping the connection: (3, 2)
> 2020-03-11 20:23:35,734 [myid:2] - INFO  [0.0.0.0/0.0.0.0:3888:QuorumCnxManager$Listener@924] - Received connection request 100.126.116.201:36140
> 2020-03-11 20:23:35,735 [myid:2] - INFO  [WorkerSender[myid=2]:QuorumCnxManager@438] - Have smaller server identifier, so dropping the connection: (4, 2)
> 2020-03-11 20:23:35,740 [myid:2] - INFO  [WorkerSender[myid=2]:QuorumCnxManager@438] - Have smaller server identifier, so dropping the connection: (5, 2)
> 2020-03-11 20:23:35,740 [myid:2] - INFO  [0.0.0.0/0.0.0.0:3888:QuorumCnxManager$Listener@924] - Received connection request 100.126.116.201:36142
> 2020-03-11 20:23:35,740 [myid:2] - INFO  [WorkerReceiver[myid=2]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0x1b8005f4bba (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1b8 (n.peerEPoch), LOOKING (my state)0 (n.config version)
> 2020-03-11 20:23:35,742 [myid:2] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@1143] - Interrupted while waiting for message on queue
> java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>         at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1294)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:82)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1131)
> 2020-03-11 20:23:35,744 [myid:2] - WARN  [SendWorker:3:QuorumCnxManager$SendWorker@1153] - Send worker leaving thread id 3 my id = 2
> 2020-03-11 20:23:35,745 [myid:2] - WARN  [RecvWorker:3:QuorumCnxManager$RecvWorker@1230] - Interrupting SendWorker{code}
> The only way I can seem to get them to rejoin the quorum is to restart the leader.
> However, if I remove server 4 and 5 from the configuration of server 1 or 2 (so only servers 1, 2, and 3 remain in the configuration file), then they can rejoin the quorum fine. Is this expected and am I doing something wrong? Any help or explanation would be greatly appreciated. Thank you.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)