[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951381#comment-15951381 ]

Alexander Shraer commented on ZOOKEEPER-2076:
---------------------------------------------

Sure, [~atris], thanks for taking this on. BTW, perhaps both items in the description are too much for a single JIRA, we could tackle one of them here and leave the other one for different JIRA(s).

> Improve Leader Change Mechanism
> -------------------------------
>
> Key: ZOOKEEPER-2076
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2076
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.5.0
> Reporter: Alexander Shraer
> Assignee: Atri Sharma
>
> When a leader is removed during a reconfiguration, ZOOKEEPER-107 uses a
> mechanism where the old leader nominates the new one. Although it reduces the
> time for a new leader to be elected, it still takes too long. This JIRA is
> for two things:
> 1. Improve the mechanism, e.g., avoid loading snapshots, etc. during the
> handoff.
> 2. Make it a first-class citizen & export it as a client API. We get
> questions about this once in a while - how do I cause a different leader to
> be elected ? Currently the response is either kill or reconfigure the current
> leader.
> Any one interested to work on this ?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950735#comment-15950735 ]

Flavio Junqueira commented on ZOOKEEPER-2076:
---------------------------------------------

Go for it, [~atris], I've assigned it to you.
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950539#comment-15950539 ]

Atri Sharma commented on ZOOKEEPER-2076:
----------------------------------------

Hi Folks,

Is this still valid? [~shralex] If nobody is working on this, I can take it up.
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231859#comment-14231859 ]

Alexander Shraer commented on ZOOKEEPER-2076:
---------------------------------------------

See Figure 8 here: http://www.cs.technion.ac.il/~shralex/zkreconfig.pdf

I think what I did was just run an ensemble locally, invoke a reconfig removing the leader, and look at the log, which includes the time. You can add logging if needed, but the current logging should probably be enough to understand when the old leader terminates and when the new one is established, to measure total time. (I don't really remember if this is how I did it since it was more than 3 years ago, but this is where I'd suggest starting.)

Exactly, I believe it's possible to do some things more efficiently, but I really haven't thought this through and I'm not familiar with the current mechanism in detail. My current implementation of leader handoff is just an optimization that usually reduces the number of rounds required in FLE to 1. I also suspect that one can even skip FLE completely and have the peers try to connect to the new leader, and only if that fails go back to FLE. Not sure this is worth doing - it depends where the time is spent currently.
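The "skip FLE and only fall back if the connection fails" idea from the comment above could look roughly like the sketch below. This is not ZooKeeper code: the class name `HandoffFallbackSketch`, the method `chooseLeader`, and the two callbacks are hypothetical stand-ins, and the sketch ignores whether a quorum ends up following the nominated server.

```java
import java.util.function.LongPredicate;
import java.util.function.LongSupplier;

/** Hypothetical sketch, not ZooKeeper code: try the nominated leader first, fall back to an election. */
class HandoffFallbackSketch {

    /**
     * @param designatedLeaderId id announced by the outgoing leader
     * @param canFollow          stand-in for "connect to this server and sync with it as leader"
     * @param runElection        stand-in for a full leader election run
     * @return id of the server this peer ends up following
     */
    static long chooseLeader(long designatedLeaderId,
                             LongPredicate canFollow,
                             LongSupplier runElection) {
        if (canFollow.test(designatedLeaderId)) {
            // Fast path: the nominee is reachable, so no election rounds are needed.
            return designatedLeaderId;
        }
        // Slow path: the nominee could not be reached, so run the normal election.
        return runElection.getAsLong();
    }

    public static void main(String[] args) {
        // Nominee 2 is reachable: follow it directly.
        System.out.println(chooseLeader(2L, id -> true, () -> 3L));
        // Nominee 2 is unreachable: the election stand-in picks server 3.
        System.out.println(chooseLeader(2L, id -> false, () -> 3L));
    }
}
```

Whether the fast path is worth having depends, as the comment says, on where the handoff time is actually spent.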
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231958#comment-14231958 ]

Hongchao Deng commented on ZOOKEEPER-2076:
------------------------------------------

My understanding is that leader election will take a long time regardless of how close the followers are:

1. Leader will take a snapshot on lead(): https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java#L418-418
2. Learner will take a snapshot on receiving NEWLEADER: https://github.com/fengjingchao/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L486-486

While the first one is unnecessary, the second one is introducing bugs... I see the best solution as fixing the snapshot-taking problem, but that is out of scope here.

Any idea on exposing the API of suggestedLeader?
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230735#comment-14230735 ]

Hongchao Deng commented on ZOOKEEPER-2076:
------------------------------------------

Hi [~shralex]. Can you explain the 1st point:

> 1. Improve the mechanism, e.g., avoid loading snapshots, etc. during the handoff.
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230740#comment-14230740 ]

Hongchao Deng commented on ZOOKEEPER-2076:
------------------------------------------

I am wondering what the details are? :)
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230747#comment-14230747 ]

Alexander Shraer commented on ZOOKEEPER-2076:
---------------------------------------------

I don't have a clear idea of what's needed for part 1. But when I measured the latency of the leader handoff it still took about 1 second, even though it should be pretty much immediate. I think this can be improved. The idea of part 1 here is to see where this time is spent and improve if possible.
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230749#comment-14230749 ]

Hongchao Deng commented on ZOOKEEPER-2076:
------------------------------------------

Would you mind pointing out the code where the handoff happens?
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230767#comment-14230767 ]

Alexander Shraer commented on ZOOKEEPER-2076:
---------------------------------------------

In Leader.java look for designatedLeader - this is where the old leader chooses and announces its replacement. Then Follower.java and Observer.java get this message (also look for designatedleader) and call QuorumPeer.processReconfig(), which gets suggestedLeaderId as a parameter. Notice the updateVote there. Then the follower throws an exception (because it's a major change) and QuorumPeer goes into LOOKING state, which invokes FastLeaderElection. This is where the initial vote set earlier is used, so they all initially vote for the designated leader, which is supposed to converge quickly.
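To illustrate why a shared initial vote should converge quickly, here is a toy model of the effect described above. It is not ZooKeeper code: `DesignatedVoteSketch`, `Vote`, and the helper methods are hypothetical, the vote ordering is simplified to (zxid, server id) with the epoch left out, and all peers are assumed to see every vote. It only counts how many peers would have to change their initial vote before everyone agrees, on the assumption that each such change costs extra election messages.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not ZooKeeper code: compares self-votes with a pre-set designated vote. */
class DesignatedVoteSketch {

    /** A simplified vote: the proposed leader and the zxid attached to the proposal. */
    static final class Vote {
        final long zxid;
        final long leaderId;

        Vote(long zxid, long leaderId) {
            this.zxid = zxid;
            this.leaderId = leaderId;
        }

        /** Higher zxid wins; ties go to the higher server id (epoch is omitted here). */
        boolean beats(Vote other) {
            return zxid > other.zxid || (zxid == other.zxid && leaderId > other.leaderId);
        }

        boolean sameAs(Vote other) {
            return zxid == other.zxid && leaderId == other.leaderId;
        }
    }

    /** Returns how many peers must abandon their initial vote before all peers agree. */
    static int voteChangesUntilAgreement(List<Vote> initialVotes) {
        Vote best = initialVotes.get(0);
        for (Vote v : initialVotes) {
            if (v.beats(best)) {
                best = v;
            }
        }
        int changes = 0;
        for (Vote v : initialVotes) {
            if (!v.sameAs(best)) {
                changes++;
            }
        }
        return changes;
    }

    /** Default start: every peer votes for itself. */
    static List<Vote> selfVotes(long zxid, long... serverIds) {
        List<Vote> votes = new ArrayList<>();
        for (long id : serverIds) {
            votes.add(new Vote(zxid, id));
        }
        return votes;
    }

    /** Handoff start: every peer's initial vote was already set to the designated leader. */
    static List<Vote> designatedVotes(long zxid, long designatedId, int peerCount) {
        List<Vote> votes = new ArrayList<>();
        for (int i = 0; i < peerCount; i++) {
            votes.add(new Vote(zxid, designatedId));
        }
        return votes;
    }

    public static void main(String[] args) {
        // Three equally up-to-date peers voting for themselves: two must switch to server 3.
        System.out.println(voteChangesUntilAgreement(selfVotes(100L, 1L, 2L, 3L)));
        // Same peers with the designated leader pre-set: nobody has to switch.
        System.out.println(voteChangesUntilAgreement(designatedVotes(100L, 2L, 3)));
    }
}
```

In this model the designated-leader start needs zero vote changes, which matches the expectation that the election settles after a single exchange.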
[jira] [Commented] (ZOOKEEPER-2076) Improve Leader Change Mechanism
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230963#comment-14230963 ]

Hongchao Deng commented on ZOOKEEPER-2076:
------------------------------------------

Hi [~shralex]. Could you do me a favor with two things:

1. Share how you measured the time so I can do the same?
2. Could you comment out the line https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java#L418-418 and measure the time again? It takes a snapshot and writes it to *DISK* here.

After all, I wonder if there is a way to do a simpler election because we already know they are synced.