[
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787928#comment-13787928
]
Benjamin Reed commented on ZOOKEEPER-1777:
------------------------------------------
i'm curious about the goal here. the scenario is that we have zookeeper servers
that have suffered permanent data losses (since they were using ram disks) and
restart with empty data. in effect they are lying: they are voting as if they
didn't suffer a failure, so our quorum protocols lose their guarantee.
the fix should be to detect the lie and halt. correct?
if you instead detect inconsistent followers and force them to sync up, you may
get consistency in the ensemble, but you may be inconsistent with reality and
with what the clients' view is.
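
to make the lost guarantee concrete, here is a minimal sketch, not zookeeper code. it assumes plain majority quorums and a write that was acknowledged by two of three servers; the server names A, B, C and the helper functions are hypothetical. it lists the quorums in which no member still holds the committed write.

```python
from itertools import combinations


def majority_quorums(servers):
    """Return every subset of servers that forms a strict majority."""
    need = len(servers) // 2 + 1
    return [
        set(q)
        for k in range(need, len(servers) + 1)
        for q in combinations(sorted(servers), k)
    ]


def quorums_missing_write(servers, holders):
    """Return the quorums in which no member still holds the committed write."""
    return [q for q in majority_quorums(servers) if not (q & holders)]


servers = {"A", "B", "C"}   # hypothetical 3-server ensemble
acked = {"A", "B"}          # the write was committed once a majority acked it

# No data loss: every majority overlaps the acking set, so the list is empty.
safe = quorums_missing_write(servers, acked)

# B loses its RAM disk and restarts empty, but still votes as a normal member.
after_loss = quorums_missing_write(servers, acked - {"B"})

print("quorums missing the write before the loss:", safe)
print("quorums missing the write after the loss:", after_loss)
```

with no loss the list is empty, because any two majorities intersect. once B comes back empty and still votes, B and C together form a quorum that has never seen the write, which is the sense in which the server is "lying".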
> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
> Reporter: Germán Blanco
> Assignee: Germán Blanco
> Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch,
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz
>
>
> In a 3-server ensemble, one of the followers doesn't see some of the
> ephemeral nodes that are present in the leader and the other follower.
> The 8 nodes missing from "the follower that is not ok" were created at the end
> of epoch 1; the ensemble is now running in epoch 2.