[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532633#comment-13532633
 ] 

Edward Ribeiro commented on ZOOKEEPER-1404:
-------------------------------------------

Unfortunately, letting the highest SEQUENCE number, instead of the smallest,
identify the leader leads to a scheme that is less stable, more complex, and
requires more operations. Consider the following scenarios, where the highest
number identifies the leader.

A server connects and creates a sequential-ephemeral node. It's the first one,
so it elects itself the leader. After that, a couple of servers connect, and
each one will have the largest number, even if only for a very brief period of
time, so the leadership will "hop" from one server to the next until it
stabilizes. This generates a couple of extra network messages and watch
setups/deliveries.
 
Furthermore, if a server loses its connection and connects again, it will
"usurp" the leadership, even if the connection trouble is transient and
specific to that one server. In a very stable server scenario this will not be
a problem (after the initial burst of leader elections), but the number of
messages sent and received (and watches set up) will be considerably higher.
In a faulty scenario, however, this will cause serious liveness problems.

On the other hand, a server with the lowest number would probably stay online
for a longer period of time, and once elected it doesn't need to change. If the
leader loses its connection, the second-oldest server takes its place. There
can be "n" new connections, but the leader will stay stable and well known for
a longer period of time.
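
To make the comparison concrete, here is a minimal sketch that replays a few
join/leave events and counts how often the leader changes under each rule. It
is only an in-memory model, not ZooKeeper client code: the function name
count_leader_changes, the event tuples, and the server names are all
hypothetical, and a plain counter stands in for the SEQUENCE flag.

```python
def count_leader_changes(events, pick_leader):
    """Replay join/leave events and count how often the leader changes.

    events: list of ("join", server) or ("leave", server) tuples.
    pick_leader: min for lowest-number-wins, max for highest-number-wins.
    """
    next_seq = 0   # assumption: stand-in for the counter behind SEQUENCE
    seq_of = {}    # live server -> sequence number of its ephemeral node
    leader = None
    changes = 0
    for action, server in events:
        if action == "join":
            seq_of[server] = next_seq
            next_seq += 1
        else:
            # Connection lost: the ephemeral node disappears.
            seq_of.pop(server, None)
        new_leader = pick_leader(seq_of, key=seq_of.get) if seq_of else None
        if new_leader != leader:
            changes += 1
            leader = new_leader
    return changes, leader


# Three servers join, then "b" loses its connection and reconnects.
events = [("join", "a"), ("join", "b"), ("join", "c"),
          ("leave", "b"), ("join", "b")]

lowest = count_leader_changes(events, min)    # (1, "a")
highest = count_leader_changes(events, max)   # (4, "b")
print("lowest wins: ", lowest)
print("highest wins:", highest)
```

With the lowest number as leader, "a" is elected once and stays leader. With
the highest number, every join triggers a change, and the reconnecting "b"
takes the leadership away from "c".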
                
> leader election pseudo code probably incorrect
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-1404
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1404
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 3.4.3
>            Reporter: Robert Varga
>
> The pseudo code for leader election in the recipes.html page of 3.4.3 
> documentation is the following...
> {quote}
> Let ELECTION be a path of choice of the application. To volunteer to be a 
> leader: 
> 1. Create znode z with path "ELECTION/guid-n_" with both SEQUENCE and 
> EPHEMERAL flags;
> 2. Let C be the children of "ELECTION", and i be the sequence number of z;
> 3. Watch for changes on "ELECTION/guid-n_j", where j is the 
> {color:red}*smallest*{color} sequence number such that j < i and n_j is a 
> znode in C;
> Upon receiving a notification of znode deletion: 
> 1. Let C be the new set of children of ELECTION; 
> 2. If z is the smallest node in C, then execute leader procedure;
> 3. Otherwise, watch for changes on "ELECTION/guid-n_j", where j is the 
> {color:red}*smallest*{color} sequence number such that j < i and n_j is a 
> znode in C; 
> {quote}
> I think, in both third steps *highest* should appear instead of 
> {color:red}*smallest*{color}.
