Enis Soztutar created ZOOKEEPER-2386:
----------------------------------------
Summary: Cannot achieve quorum when middle server (in a quorum of 3) is unreachable
Key: ZOOKEEPER-2386
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2386
Project: ZooKeeper
Issue Type: Bug
Reporter: Enis Soztutar

Recently, we observed a curious case where a quorum was not reached for days in a cluster of 3 nodes (zk0, zk1, zk2) while the middle node, zk1, was unreachable over the network.

Leader election happens, and both zk0 and zk2 start voting. Each server then sends notifications to every other server, including itself. The problem is that the zk1 VM is unavailable, so when zk2 tries to open a socket to that server, with a socket timeout of 5 seconds, the attempt delays processing of the vote notification that zk2 sent to itself. The vote eventually arrives after 5 seconds, but by that time zk0 has already transitioned to the follower state.

In the follower state, the follower tries to connect to the leader 5 times with a 1 second timeout (5 seconds in total). Since the leader does not open its peer port until 5 seconds after the follower starts, the follower always times out connecting to the leader. This cycle repeated for hours or days, even after restarting the servers several times.

I believe this is related to the default timeouts: the 5 second socket timeout and the follower-to-leader connection timeout (5 tries with a 1 second timeout). The quorum became operational only after we set {{zookeeper.cnxTimeout}} to 1 second.

More logs coming shortly.

zoo.cfg:
{code}
server.3=zk2-hostname:2889:3889
server.2=zk1-hostname:2889:3889
server.1=zk0-hostname:2889:3889
{code}
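
To make the timing argument concrete, here is a minimal sketch of the race described above. It is a simplified model, not ZooKeeper code. The function name and parameters are hypothetical. It assumes the leader's peer port opens exactly one election socket timeout after the follower starts connecting, and that the follower makes back-to-back attempts of 1 second each.

```python
def first_successful_attempt(leader_delay_s, tries=5, per_try_timeout_s=1.0):
    """Return the 1-based attempt on which the follower connects, or None.

    leader_delay_s: assumed delay before the leader opens its peer port,
    measured from the moment the follower starts connecting. In the report
    this equals the election socket timeout spent on the unreachable server.
    """
    for attempt in range(1, tries + 1):
        # Attempt N gives up at N * per_try_timeout_s seconds.
        window_end_s = attempt * per_try_timeout_s
        if leader_delay_s < window_end_s:
            return attempt
    # Every attempt timed out before the leader was listening.
    return None


if __name__ == "__main__":
    # Default 5 second socket timeout: the follower's 5 x 1 second window
    # closes just as the leader becomes reachable.
    print(first_successful_attempt(5.0))  # None
    # Socket timeout lowered to 1 second: the second attempt succeeds.
    print(first_successful_attempt(1.0))  # 2
```

Under these assumptions the follower's retry window and the leader's startup delay are both 5 seconds by default, so the follower gives up at the moment the leader becomes reachable. Shortening the election socket timeout breaks the tie.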