[jira] Commented: (ZOOKEEPER-275) Bug in FastLeaderElection
[ https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669568#action_12669568 ]

Hudson commented on ZOOKEEPER-275:
----------------------------------

Integrated in ZooKeeper-trunk #217 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/217/])

Bug in FastLeaderElection
-------------------------

Key: ZOOKEEPER-275
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
Project: Zookeeper
Issue Type: Bug
Components: leaderElection
Affects Versions: 3.0.0, 3.0.1
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
Fix For: 3.1.0
Attachments: ZOOKEEPER-275.patch, ZOOKEEPER-275.patch, ZOOKEEPER-275.patch, ZOOKEEPER-275.patch

I found an execution in which leader election does not make progress. Here is the problematic scenario:
- We have an ensemble of 3 servers, and we start only 2;
- We let them elect a leader, and then crash the one with the lowest id, say S_1 (call the other S_2);
- We restart the crashed server.

Upon restarting S_1, S_2 has its logical clock more advanced, and S_1 has its logical clock set to 1. Once S_1 receives a notification from S_2, it notices that it is in the wrong round and advances its logical clock to the same value as S_2. The problem arises exactly at this point, because in the current code S_1 resets its vote to its initial vote (its own id and zxid). Since S_2 has already notified S_1, it won't do so again, and we are stuck. The patch I'm submitting fixes this problem by setting the vote of S_1 to the one received if it satisfies the total order predicate (the received zxid is higher, or the received zxid is the same and the received id is higher).

Related to this problem, I noticed that, because we try to avoid unnecessary duplicate notifications, there could be problematic scenarios in which a server fails before electing a leader and restarts before leader election succeeds. This could happen, for example, when there aren't enough servers available and one of the available servers crashes and restarts. I fixed this problem in the attached patch by allowing a server to send a new batch of notifications if at least one outgoing queue of pending notifications is empty. This is ok because we space out consecutive batches of notifications.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
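
The two rules in the issue description (which vote S_1 should hold after jumping to a later round, and when a new batch of notifications may be sent) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from ZOOKEEPER-275.patch: the class `ElectionSketch`, the `Vote` type, and the methods `voteAfterRoundChange` and `anyQueueEmpty` are hypothetical names, and the ids and zxids in `main` are made-up values.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Queue;

// Hypothetical sketch; names are illustrative and not taken from the ZooKeeper patch.
final class ElectionSketch {

    /** A vote: the proposed leader's id and its last zxid. */
    static final class Vote {
        final long id;
        final long zxid;

        Vote(long id, long zxid) {
            this.id = id;
            this.zxid = zxid;
        }
    }

    /** Total order predicate as described: higher zxid wins, ties go to the higher id. */
    static boolean totalOrderPredicate(long newId, long newZxid, long curId, long curZxid) {
        return newZxid > curZxid || (newZxid == curZxid && newId > curId);
    }

    /**
     * Vote to hold after moving to a later round: adopt the received vote if it
     * beats the server's own initial vote, instead of always resetting to the initial vote.
     */
    static Vote voteAfterRoundChange(Vote initial, Vote received) {
        return totalOrderPredicate(received.id, received.zxid, initial.id, initial.zxid)
                ? received
                : initial;
    }

    /** Resend condition as described: at least one outgoing queue has no pending notifications. */
    static boolean anyQueueEmpty(Collection<? extends Queue<?>> outgoingQueues) {
        for (Queue<?> q : outgoingQueues) {
            if (q.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed values: S_1 and S_2 have the same zxid, so the higher id decides.
        Vote s1Initial = new Vote(1, 5);
        Vote fromS2 = new Vote(2, 5);
        Vote chosen = voteAfterRoundChange(s1Initial, fromS2);
        System.out.println("S_1 votes for server " + chosen.id); // prints "S_1 votes for server 2"

        Queue<String> toS2 = new ArrayDeque<>();
        Queue<String> toS3 = new ArrayDeque<>();
        toS3.add("pending notification"); // S_3 was never started, so its queue does not drain
        System.out.println("resend allowed: " + anyQueueEmpty(Arrays.asList(toS2, toS3)));
    }
}
```

With these assumed values, S_1 adopts the vote received from S_2 rather than falling back to its own, which is the behavior the description says the patch introduces. The empty queue to S_2 is enough to permit another batch even though the queue to the absent third server never drains.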
[jira] Commented: (ZOOKEEPER-275) Bug in FastLeaderElection
[ https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666934#action_12666934 ]

Flavio Paiva Junqueira commented on ZOOKEEPER-275:
--------------------------------------------------

"Kill" means calling the shutdown method of QuorumCnxManager.Listener. I said "kill" because this is how I reproduce the problem outside of unit tests, by just killing the server (ctrl-c). Now, we do close the socket there, but from what I found on the net, calling close and having it return doesn't mean that the port is released.

This patch is good for me, but it doesn't include a unit test for the case we are discussing. Although I think the correction to the problem pointed out is trivial, having a unit test would prevent us from making the same mistake in the future. So, I'd like to submit the current patch for review and perhaps open another JIRA for the unit test if you think it is worth having.
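
On the point that a returned close does not guarantee the port can be bound again right away: one common mitigation, not necessarily what the patch or QuorumCnxManager.Listener does, is to set SO_REUSEADDR on the listening socket before binding. The sketch below is an assumption-laden illustration; `ListenerSocketSketch` and `open` are hypothetical names, and the default value of SO_REUSEADDR for a `ServerSocket` is platform-dependent.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch; not the listener code from ZooKeeper.
final class ListenerSocketSketch {

    /** Opens a listening socket with SO_REUSEADDR enabled before it is bound. */
    static ServerSocket open(int port) throws IOException {
        ServerSocket ss = new ServerSocket();  // created unbound so the option can be set first
        ss.setReuseAddress(true);              // has to be set before bind() to take effect
        ss.bind(new InetSocketAddress(port));
        return ss;
    }

    public static void main(String[] args) throws IOException {
        ServerSocket first = open(0);          // port 0 asks the OS for any free port
        int port = first.getLocalPort();
        first.close();                         // explicit close, not left to gc

        // Rebind to the same port immediately, as a restarted server would.
        try (ServerSocket second = open(port)) {
            System.out.println("rebound to same port: " + (second.getLocalPort() == port));
        }
    }
}
```

The socket is created unbound because the option only matters if it is in place when `bind` is called. Whether this addresses the behavior seen when killing the server with ctrl-c would still need to be checked against the actual listener code.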
[jira] Commented: (ZOOKEEPER-275) Bug in FastLeaderElection
[ https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666786#action_12666786 ]

Patrick Hunt commented on ZOOKEEPER-275:
----------------------------------------

What do you mean by kill qcnxmanager? During the kill, is the code explicitly closing the port? It might be that the socket isn't being closed explicitly (and relies on gc)? Ensure that the code explicitly closes the port if killed.

If you're done on this issue you might consider submitting the patch for review.