gaoxiao created ZOOKEEPER-1492:
----------------------------------

             Summary: leader cannot switch to LOOKING state when lost the majority
                 Key: ZOOKEEPER-1492
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1492
             Project: ZooKeeper
          Issue Type: Bug
          Components: quorum
    Affects Versions: 3.4.3
         Environment: eclipse linux
            Reporter: gaoxiao
            Priority: Critical

When a follower leaves the cluster and the remaining voting members can no longer form a majority, the leader should leave the LEADING state and enter the LOOKING state. If the cluster has observers, however, the leader stays in LEADING and clients cannot use the cluster.

Example server configuration:

    server.1=z1:2888:3888
    server.2=z2:2888:3888
    server.3=z3:2888:3888:observer

At first, servers 1, 2 and 3 are all started and everything works; server 2 is the leader. If server 1 is then stopped, server 2 does not leave the LEADING state, and clients cannot connect to the cluster.

I think the problem is in Leader.java, method lead(), lines 388-407:

    syncedSet.add(self.getId());
    synchronized (learners) {
        for (LearnerHandler f : learners) {
            if (f.synced()) {
                syncedCount++;
                syncedSet.add(f.getSid());
            }
            f.ping();
        }
    }
    if (!tickSkip && !self.getQuorumVerifier().containsQuorum(syncedSet)) {
        //if (!tickSkip && syncedCount < self.quorumPeers.size() / 2) {
        // Lost quorum, shutdown
        // TODO: message is wrong unless majority quorums used
        shutdown("Only " + syncedCount + " followers, need "
                + (self.getVotingView().size() / 2));
        // make sure the order is the same!
        // the leader goes to looking
        return;
    }

This code adds every synced learner, observers included, to syncedSet. I think only followers should be added to syncedSet at this point, so that containsQuorum can determine the majority correctly.
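
To illustrate the suspected cause, here is a minimal, self-contained sketch. It is not ZooKeeper code: QuorumSketch, Learner, LearnerType and both syncedSet* helpers are hypothetical stand-ins, and the quorum check assumes a simple majority rule (strictly more than half of the voting members). With the configuration above, the leader (2) plus the observer (3) give a set of size 2, which passes a majority check over two voting members even though follower 1 is gone.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are not ZooKeeper classes.
class QuorumSketch {
    enum LearnerType { PARTICIPANT, OBSERVER }

    // Minimal model of a connected learner: server id, role and sync status.
    static final class Learner {
        final long sid;
        final LearnerType type;
        final boolean synced;

        Learner(long sid, LearnerType type, boolean synced) {
            this.sid = sid;
            this.type = type;
            this.synced = synced;
        }
    }

    // Assumed majority rule: strictly more than half of the voting members.
    static boolean containsQuorum(Set<Long> acked, int votingMembers) {
        return acked.size() > votingMembers / 2;
    }

    // Behavior described in the report: every synced learner is counted.
    static Set<Long> syncedSetAllLearners(long selfId, List<Learner> learners) {
        Set<Long> synced = new HashSet<>();
        synced.add(selfId);
        for (Learner l : learners) {
            if (l.synced) {
                synced.add(l.sid);
            }
        }
        return synced;
    }

    // Proposed behavior: only synced voting members (followers) are counted.
    static Set<Long> syncedSetVotersOnly(long selfId, List<Learner> learners) {
        Set<Long> synced = new HashSet<>();
        synced.add(selfId);
        for (Learner l : learners) {
            if (l.synced && l.type == LearnerType.PARTICIPANT) {
                synced.add(l.sid);
            }
        }
        return synced;
    }

    public static void main(String[] args) {
        // Leader is server 2; follower 1 is stopped; observer 3 is still synced.
        List<Learner> learners =
                Collections.singletonList(new Learner(3L, LearnerType.OBSERVER, true));
        int votingMembers = 2; // servers 1 and 2

        // Counting the observer: {2, 3} has size 2 > 1, so quorum appears intact.
        System.out.println(containsQuorum(syncedSetAllLearners(2L, learners), votingMembers));
        // Counting voters only: {2} has size 1, not > 1, so quorum is lost.
        System.out.println(containsQuorum(syncedSetVotersOnly(2L, learners), votingMembers));
    }
}
```

Under these assumptions the first check reports a quorum and the second does not, which matches the observed behavior: the leader never sees the loss of quorum and so never returns to LOOKING.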