[
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804344#comment-13804344
]
Germán Blanco commented on ZOOKEEPER-1777:
------------------------------------------
I have received a different suggestion that has less impact. The idea is
to reserve some bits of the zxid (e.g. 12 bits) for a sanity check.
That means the zxid will roll over more often, but the remaining space for
zxid+epoch (51 bits) should still last for more than one hundred years.
The leader would calculate this sanity-check value when incrementing the zxid;
it could be, e.g., a random number.
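
To make the idea concrete, here is a minimal sketch of how the bits could be
packed. Everything in it is an assumption for illustration: the class name
CheckedZxid, the choice of the low 12 bits for the check value, and treating
the rest of the long as one combined epoch+counter sequence. It is not taken
from the ZooKeeper code or from any attached patch.

```java
import java.util.Random;

// Hypothetical helper, not part of ZooKeeper.
final class CheckedZxid {
    // Assumption: the check value lives in the low 12 bits of the 64-bit zxid.
    static final int CHECK_BITS = 12;
    static final long CHECK_MASK = (1L << CHECK_BITS) - 1;

    // Combine the epoch+counter sequence with a check value.
    static long pack(long sequence, int check) {
        return (sequence << CHECK_BITS) | (check & CHECK_MASK);
    }

    // Extract the epoch+counter part (upper bits).
    static long sequenceOf(long zxid) {
        return zxid >>> CHECK_BITS;
    }

    // Extract the sanity-check value (low 12 bits).
    static int checkOf(long zxid) {
        return (int) (zxid & CHECK_MASK);
    }

    // What the leader would do when increasing the zxid: advance the
    // sequence by one and attach a fresh random check value.
    static long next(long previousZxid, Random rng) {
        long sequence = sequenceOf(previousZxid) + 1;
        return pack(sequence, rng.nextInt(1 << CHECK_BITS));
    }
}
```

With the check value in the low bits, comparing two zxids with different
sequence values still gives the same order as before, which is one reason to
place it there in this sketch.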
When a follower connects to a leader, or a client connects to a server, the
leader or server only checks whether the zxid it receives is present in its
transaction history. If it is not there, a warning is logged and either a
snapshot is sent to the follower or the client connection is closed.
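
The check on connect could look roughly like the sketch below. The names
(HistoryCheck, Peer, Action, onConnect) are hypothetical, and the in-memory set
only stands in for the real transaction history. It shows the decision, not
how ZooKeeper actually stores or searches its log.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not part of ZooKeeper.
final class HistoryCheck {
    enum Peer { FOLLOWER, CLIENT }
    enum Action { PROCEED, SEND_SNAPSHOT, CLOSE_CONNECTION }

    // Stand-in for the transaction history of the leader or server.
    private final Set<Long> history = new HashSet<>();

    // Remember a zxid (including its check bits) that was really issued.
    void record(long zxid) {
        history.add(zxid);
    }

    // Decide what to do with the last zxid presented by a connecting peer.
    Action onConnect(Peer peer, long lastSeenZxid) {
        if (history.contains(lastSeenZxid)) {
            return Action.PROCEED; // zxid is known, continue as usual
        }
        System.err.println("WARN: zxid 0x" + Long.toHexString(lastSeenZxid)
                + " not found in transaction history");
        return peer == Peer.FOLLOWER
                ? Action.SEND_SNAPSHOT      // follower gets a full snapshot
                : Action.CLOSE_CONNECTION;  // client connection is closed
    }
}
```

Since a zxid with the same sequence but produced by a different leader would
very likely carry a different check value, it would not be found in the
history and would take the warning path.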
As far as I can see, there is no need to modify any protocol or storage for
this. Most likely the biggest impact will be on the test cases. However, if
this is a configuration option, it will also be possible to turn it off and
so avoid the failures in some of the test cases.
Any comments or opinions?
> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
> Reporter: Germán Blanco
> Assignee: Germán Blanco
> Priority: Critical
> Fix For: 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch,
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz
>
>
> In a 3-server ensemble, one of the followers doesn't see some of the
> ephemeral nodes that are present in the leader and the other follower.
> The 8 missing nodes in "the follower that is not ok" were created at the end
> of epoch 1; the ensemble is running in epoch 2.