[
https://issues.apache.org/jira/browse/ZOOKEEPER-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812455#comment-13812455
]
Germán Blanco commented on ZOOKEEPER-1805:
------------------------------------------
As far as I can see, there is never a mix of messages with and without don't
care values.
The don't care values never get sent over the network ... or at least that was
not intentional.
I have noticed that the current value (-1) happens to be the same one that was
used by default in Vote.java by some of the incomplete constructors, and this
is why the value appears in the traces sent by Raúl for the epoch (note that
the epoch was not set to the don't care value in this case). But it has nothing
to do with the patch for ZOOKEEPER-1732. You can see that, e.g., zxid does not
have a don't care value in these traces.
What your change does is this: if a vote has a don't care value, it checks
whether the epoch is greater or equal between the vote with the don't care
value and the other. All votes in the outofelection collection have don't care
values, so the result is that the comparison ignores the value of the epochs in
all cases. When both votes being compared have don't care values, the epoch may
be greater or equal, or smaller or equal, and the comparison still succeeds.
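To illustrate the point above, here is a minimal sketch. It is not the code
from Vote.java or from the patch: the class SketchVote, the hasDontCare flag
and the direction of the >= check are assumptions made only for illustration.
It shows why the epoch check becomes vacuous once both votes carry a don't care
value, since one of the two >= tests always holds.

```java
// Hypothetical sketch, not ZooKeeper's Vote.java.
final class EpochCheckSketch {

    /** Simplified vote holding only what the illustration needs. */
    static final class SketchVote {
        final long peerEpoch;
        final boolean hasDontCare; // assumed flag: the vote carries a don't care value

        SketchVote(long peerEpoch, boolean hasDontCare) {
            this.peerEpoch = peerEpoch;
            this.hasDontCare = hasDontCare;
        }
    }

    /** Epoch comparison as described in the paragraph above (direction assumed). */
    static boolean epochsMatch(SketchVote a, SketchVote b) {
        if (!a.hasDontCare && !b.hasDontCare) {
            return a.peerEpoch == b.peerEpoch; // strict comparison
        }
        boolean ok = false;
        if (a.hasDontCare) {
            ok |= a.peerEpoch >= b.peerEpoch;
        }
        if (b.hasDontCare) {
            ok |= b.peerEpoch >= a.peerEpoch;
        }
        // If both flags are set, one of the two tests above is always true.
        return ok;
    }

    public static void main(String[] args) {
        SketchVote x = new SketchVote(3, true);
        SketchVote y = new SketchVote(7, true);
        System.out.println(epochsMatch(x, y)); // true
        System.out.println(epochsMatch(y, x)); // true
    }
}
```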
The same result would have been achieved by setting the epoch to the don't care
value when inserting the vote in the outofelection collection (and in the call
to termPredicate) and not making any changes at all in the comparisons in
Vote.java. In that case, the changes in Learner.java, Leader.java and
QuorumPeer.java no longer serve any purpose, since all they do is set the epoch
to a common value in Learners and Leader, and that value is going to be
ignored. That is the approach I would take to implement your proposal. For a
test case, it would be enough to
modify the test case added in ZOOKEEPER-1732 and just set the peerEpoch to any
value, so that it is clear that this value is also ignored in the comparison.
But as far as I can see, the current patch has the same behaviour, and the
final decision on how to code this behaviour is yours, so both solutions to
this problem are fine with me.
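A rough sketch of that alternative, under stated assumptions: SketchVote,
ignoringEpoch and countMatching are hypothetical names, and countMatching is
only a stand-in for the quorum check, not the real termPredicate. The epoch is
overwritten with the don't care value (-1) both when a vote is stored and on
the vote being checked, so a plain equality comparison needs no changes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not ZooKeeper's FastLeaderElection or Vote.java.
final class OutOfElectionSketch {
    static final long DONT_CARE = -1L; // the value mentioned in the comment

    static final class SketchVote {
        final long leader;
        final long peerEpoch;

        SketchVote(long leader, long peerEpoch) {
            this.leader = leader;
            this.peerEpoch = peerEpoch;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SketchVote)) {
                return false;
            }
            SketchVote other = (SketchVote) o;
            // Unmodified, strict field-by-field comparison.
            return leader == other.leader && peerEpoch == other.peerEpoch;
        }

        @Override
        public int hashCode() {
            return Objects.hash(leader, peerEpoch);
        }
    }

    /** Copy of the vote with the epoch replaced by the don't care value. */
    static SketchVote ignoringEpoch(SketchVote v) {
        return new SketchVote(v.leader, DONT_CARE);
    }

    /** Stand-in for the quorum check: counts stored votes equal to the candidate. */
    static int countMatching(Map<Long, SketchVote> votes, SketchVote candidate) {
        int n = 0;
        for (SketchVote v : votes.values()) {
            if (v.equals(candidate)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Map<Long, SketchVote> outofelection = new HashMap<>();
        outofelection.put(1L, ignoringEpoch(new SketchVote(2, 4)));
        outofelection.put(2L, ignoringEpoch(new SketchVote(2, 5)));
        outofelection.put(3L, ignoringEpoch(new SketchVote(3, 5)));
        SketchVote candidate = ignoringEpoch(new SketchVote(2, 9));
        System.out.println(countMatching(outofelection, candidate)); // 2
    }
}
```

Votes for leader 2 match regardless of their original epochs (4, 5 and 9),
while the vote for leader 3 still does not.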
If the decision were mine, I would go for setting the epoch to newEpoch-1. That
is (arguably) a bit hacky, but the hack only covers the upgrade case and has no
effect in other cases. Ignoring the
epoch applies to all cases in which a new server joins an established ensemble
and it might have (at least) the problem that votes from ensembles established
with different epochs are taken into account as if they belonged to the same
ensemble. I don't like that too much, but failures don't seem likely and they
might not cause problems, since even if the new server joins the wrong leader,
this leader will not process any transaction unless it has acks from sufficient
followers. So the potential problem seems to be only a small possibility of a
delay when joining the right ensemble. That means both (newEpoch-1 and ignoring
epoch) look to me like working solutions.
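The safety argument above can be sketched as follows, assuming a simple
majority quorum and assuming the leader's own ack is included in the count.
The class and method names are hypothetical, and other quorum schemes are not
modelled.

```java
// Hypothetical sketch of a majority check, not ZooKeeper's quorum verifier.
final class QuorumSketch {

    /** True when the acks form a strict majority of the ensemble. */
    static boolean hasQuorum(int acks, int ensembleSize) {
        return acks > ensembleSize / 2;
    }

    public static void main(String[] args) {
        // A "wrong" leader in a 5-server ensemble, followed only by the new server:
        System.out.println(hasQuorum(2, 5)); // false, so no transaction is processed
        // The established leader with two followers:
        System.out.println(hasQuorum(3, 5)); // true
    }
}
```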
Sorry if that was too long, but I think it covers all aspects of my
personal view of this issue. The short summary is "I am ok with this solution".
If you want a patch with my alternative implementation of the option of
ignoring the epoch, I can also prepare that.
> "Don't care" value in ZooKeeper election breaks rolling upgrades
> ----------------------------------------------------------------
>
> Key: ZOOKEEPER-1805
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1805
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Flavio Junqueira
> Assignee: Flavio Junqueira
> Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: ZOOKEEPER-1805-b3.4.patch, ZOOKEEPER-1805.patch,
> ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch,
> ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch, ZOOKEEPER-1805.patch
>
>
> This is an issue that was originally reported in ZOOKEEPER-1732.