[
https://issues.apache.org/jira/browse/ZOOKEEPER-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720595#comment-13720595
]
Flavio Junqueira commented on ZOOKEEPER-1732:
---------------------------------------------
bq. joining an ensemble that votes me as the leader.
I'm ok with removing it; this is only an optimization. If the leader is being
re-elected, then it means that the ensemble it is trying to join is not
functional, since the leader is not present.
To do it, you might as well look at the ZOOKEEPER-1514 change in checkLeader.
I think the if block you added is not necessary if you make the change in
checkLeader.
bq. taking into account my own votes or votes that put me as a leader when
joining an ensemble.
I don't think we are currently taking into account the vote of a LOOKING server
when processing FOLLOWING/LEADING notifications. If you're talking about
endVote, this is the vote corresponding to the leader it elected.
bq. removing the check for the election round when joining an established
ensemble.
Let me give some insight here first. We need to have servers joining an
established ensemble because a server may find that a quorum is already
following some leader and if it follows the standard procedure of processing
notifications, then there are some corner cases that can cause it to keep
electing some other server that is also looking.
The danger of joining an established ensemble is the following. Say that a
minority of followers supports a leader L, and a majority M supports L'. L' has
enough supporters and is able to commit txns. Now say that a server S in the
ensemble of L' crashes and recovers. S talks to L and its minority, which
together with S now form a majority (say L was one server short of a
majority). L will tell all servers in its ensemble to truncate, causing some
txns to be lost.
We have a couple of mechanisms that prevent this incorrect truncation from
happening. First, S needs to receive FOLLOWING/LEADING notifications from a
quorum, not including itself. In this case, the incorrect truncation only
happens if S receives a stale message, that is, a message from a server S'
that later on followed L'. Second, we prevent this case by having at most one
outstanding notification in the queue of a peer in QuorumCnxManager. If S' has
followed L', then its notification must reflect it and S won't receive such a
stale message.
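
To illustrate the second mechanism, here is a minimal sketch of a send queue
that holds at most one outstanding notification per peer. This is not the
QuorumCnxManager code; the class SingleSlotSendQueue and its methods are
hypothetical names made up for the example. The idea is only that a newer
notification replaces an older one that has not been sent yet, so the peer
never gets a vote that the sender has already superseded.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical illustration, not ZooKeeper's QuorumCnxManager code.
// Holds at most one pending notification for a single peer.
final class SingleSlotSendQueue<T> {
    // Capacity 1: there is never more than one outstanding notification.
    private final ArrayBlockingQueue<T> queue = new ArrayBlockingQueue<>(1);

    // Replace any unsent (possibly stale) notification with the latest one.
    synchronized void offerLatest(T notification) {
        queue.poll();            // drop the older notification, if any
        queue.add(notification); // cannot fail: the single slot is free now
    }

    // Take the pending notification for sending, or null if there is none.
    synchronized T poll() {
        return queue.poll();
    }

    synchronized int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        SingleSlotSendQueue<String> toPeerS = new SingleSlotSendQueue<>();
        toPeerS.offerLatest("FOLLOWING L");  // S' first supported L
        toPeerS.offerLatest("FOLLOWING L'"); // S' later followed L'
        // Only the most recent notification is left to be sent to S.
        System.out.println(toPeerS.poll());
    }
}
```

With this shape, if S' moves from L to L' before its earlier notification has
been sent, only the notification for L' remains in the queue.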
Overall it sounds fine to only consider the server the followers are following.
Note that not only the round could be different, but I believe the zxid could
also be different.
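
As a rough sketch of what "only consider the server the followers are
following" could look like: count, per leader id, the FOLLOWING/LEADING
notifications received from other servers, and ignore the round and the zxid
they carry. This is not the FastLeaderElection code. The class
EstablishedEnsembleCheck, the Notification type and the method
leaderWithQuorum are hypothetical names used only for illustration, and a
plain majority of the ensemble is assumed as the quorum.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not ZooKeeper's FastLeaderElection code.
final class EstablishedEnsembleCheck {
    enum State { LOOKING, FOLLOWING, LEADING }

    static final class Notification {
        final long senderId;
        final State state;
        final long leaderId;
        final long electionEpoch; // carried, but deliberately not compared
        final long zxid;          // carried, but deliberately not compared

        Notification(long senderId, State state, long leaderId,
                     long electionEpoch, long zxid) {
            this.senderId = senderId;
            this.state = state;
            this.leaderId = leaderId;
            this.electionEpoch = electionEpoch;
            this.zxid = zxid;
        }
    }

    // Returns the leader supported by a majority of servers other than
    // selfId, or null if no such leader exists. Assumes majority quorums.
    static Long leaderWithQuorum(long selfId, int ensembleSize,
                                 List<Notification> received) {
        // Keep only the latest FOLLOWING/LEADING notification per sender.
        Map<Long, Long> leaderBySender = new HashMap<>();
        for (Notification n : received) {
            if (n.senderId == selfId || n.state == State.LOOKING) {
                continue; // own vote and LOOKING servers do not count
            }
            leaderBySender.put(n.senderId, n.leaderId);
        }
        // Count supporters per leader id only.
        Map<Long, Integer> support = new HashMap<>();
        for (long leader : leaderBySender.values()) {
            support.merge(leader, 1, Integer::sum);
        }
        for (Map.Entry<Long, Integer> e : support.entrySet()) {
            if (e.getValue() > ensembleSize / 2) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Notification> received = Arrays.asList(
            new Notification(1, State.LEADING, 1, 3, 0x100L),
            new Notification(2, State.FOLLOWING, 1, 4, 0x101L),
            new Notification(3, State.FOLLOWING, 1, 3, 0x100L));
        // Server 5 is looking in an ensemble of 5 servers.
        System.out.println(leaderWithQuorum(5, 5, received));
    }
}
```

Here servers 1, 2 and 3 report different rounds and zxids but the same leader,
so server 5 would still see a quorum behind server 1.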
> ZooKeeper server unable to join established ensemble
> ----------------------------------------------------
>
> Key: ZOOKEEPER-1732
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1732
> Project: ZooKeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.4.5
> Environment: Windows 7, Java 1.7
> Reporter: Germán Blanco
> Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: test_loosen_restrictions.tar.gz, zklog.tar.gz,
> ZOOKEEPER-1732-LOOSEN_RESTRICTIONS.patch
>
>
> I have a test in which I do a rolling restart of three ZooKeeper servers and
> it was failing from time to time.
> I ran the tests in a loop until the failure came out and it seems that at
> some point one of the servers is unable to join the ensemble formed by the
> other two.