[ https://issues.apache.org/jira/browse/RATIS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301135#comment-16301135 ]
Tsz Wo Nicholas Sze commented on RATIS-178:
-------------------------------------------

{code}
  @Test
  public void testLateServerStart() throws Exception {
    final int numServer = 3;
    LOG.info("Running testLateServerStart");
    final MiniRaftCluster cluster = newCluster(numServer);
    cluster.initServers();

    // start all servers except the last one
    final Iterator<RaftServerProxy> i = cluster.getServers().iterator();
    for(int j = 1; j < numServer; j++) {
      i.next().start();
    }

    // the running majority should elect a leader on its own
    final RaftServerImpl leader = waitForLeader(cluster);
    TimeUnit.SECONDS.sleep(10);

    // start the last server
    final RaftServerProxy lastServer = i.next();
    lastServer.start();

    // poll until the last server learns who the leader is
    final RaftPeerId lastServerLeaderId = JavaUtils.attempt(
        () -> getLeader(lastServer.getImpl().getState()),
        10, 1000, "getLeaderId", LOG);
    Assert.assertEquals(leader.getId(), lastServerLeaderId);
  }

  static RaftPeerId getLeader(ServerState state) {
    final RaftPeerId leader = state.getLeaderId();
    if (leader == null) {
      throw new IllegalStateException("No leader yet");
    }
    return leader;
  }
{code}

The test above can reproduce the bug. The last server, s2, may not be able to join the group: s2 keeps starting leader elections, but s0 and s1 keep withholding their votes.

{code}
2017-12-22 16:48:13,621 INFO  impl.RaftServerImpl (RaftServerImpl.java:requestVote(622)) - s0 Withhold vote from server s2 with term 1. This server: LEADER group-C2DF75108086 s0:t2, leader=s0, voted=s0, raftlog=[(t:2, i:0)], conf=[s0:0.0.0.0:55968, s1:0.0.0.0:55969, s2:0.0.0.0:55970], old=null RUNNING, last rpc time from leader s0 is -1
2017-12-22 16:48:13,621 INFO  impl.RaftServerImpl (RaftServerImpl.java:requestVote(622)) - s1 Withhold vote from server s2 with term 1. This server:FOLLOWER group-C2DF75108086 s1:t2, leader=s0, voted=s0, raftlog=[(t:2, i:0)], conf=[s0:0.0.0.0:55968, s1:0.0.0.0:55969, s2:0.0.0.0:55970], old=null RUNNING, last rpc time from leader s0 is 11137ms
{code}

> The third server cannot join the raft group
> -------------------------------------------
>
>                 Key: RATIS-178
>                 URL: https://issues.apache.org/jira/browse/RATIS-178
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: Tsz Wo Nicholas Sze
>
> When two servers start in a 3-server group, they may elect a leader and then
> start the service. Then, start the third server. It somehow fails to join
> the group.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
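For readers without the Ratis source at hand, the polling step in the test (retry getLeader until the late server reports a leader) can be sketched in isolation. This is only an illustration of the retry pattern, not the Ratis implementation: the class RetrySketch, the simplified attempt signature (no logger argument), and the use of plain strings in place of RaftPeerId are all assumptions made for this example.

```java
import java.util.function.Supplier;

class RetrySketch {
  /**
   * Hypothetical retry helper: calls the supplier up to numAttempts times,
   * sleeping sleepMs between failed attempts, and rethrows after the last failure.
   */
  static <T> T attempt(Supplier<T> supplier, int numAttempts, long sleepMs, String name)
      throws InterruptedException {
    if (numAttempts <= 0) {
      throw new IllegalArgumentException("numAttempts must be positive: " + numAttempts);
    }
    for (int i = 1; ; i++) {
      try {
        return supplier.get();
      } catch (RuntimeException e) {
        if (i >= numAttempts) {
          throw new IllegalStateException(
              name + " failed after " + numAttempts + " attempts", e);
        }
        System.out.println(name + ": attempt " + i + " failed: " + e.getMessage());
      }
      Thread.sleep(sleepMs);
    }
  }

  /** Stand-in for the getLeader helper in the test; a null id means no leader is known yet. */
  static String getLeader(Supplier<String> leaderIdSource) {
    final String leader = leaderIdSource.get();
    if (leader == null) {
      throw new IllegalStateException("No leader yet");
    }
    return leader;
  }

  public static void main(String[] args) throws Exception {
    final int[] polls = {0};
    // Simulated late server: it only learns the leader id on the third poll.
    final Supplier<String> state = () -> ++polls[0] < 3 ? null : "s0";
    final String leaderId = attempt(() -> getLeader(state), 10, 10, "getLeaderId");
    System.out.println("leader = " + leaderId + " after " + polls[0] + " polls");
  }
}
```

In the failing scenario reported above, the supplier would presumably keep throwing because s2 never learns a leader, so a helper like this would give up after the last attempt and the test would fail.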