[
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787959#comment-13787959
]
Germán Blanco commented on ZOOKEEPER-1777:
------------------------------------------
Thanks a lot.
I am very focused on the particular way in which ZooKeeper is used in my case
and often I lose the general picture.
The fix proposed in the patch resolves the inconsistency and lets the ensemble
continue working. This is because in my case, as long as there is a single
picture of reality, it is not that important if that picture is missing a part
of the history. It is, on the other hand, tremendously important that the
ZooKeeper service is not interrupted.
I understand that this might not be the case for many others.
So please help me find a solution that fits the general case, if that is
possible. I can also understand that there might be no solution that fits the
general case.
It would be possible to extend the checking of the history to clients. That
would help to detect the problem. The thing is that in some cases clients might
just need to be restarted, while in others this might be a fatal error that
should never happen, and the best action is a system halt and a huge error
report.
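As a rough illustration of the kind of detection mentioned above, here is a
minimal sketch that compares the set of node paths seen on each member of the
ensemble and reports which paths each member is missing. It is not part of the
patch. The function name find_missing_nodes, the server labels and the example
paths are all hypothetical, and how the per-server views are collected is
deliberately left out. A real check would also have to allow for followers that
are simply lagging behind the leader.

```python
from typing import Dict, Set


def find_missing_nodes(views: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per server, the paths seen on some other server but not on it.

    `views` maps a server label to the set of node paths that server reports.
    Servers that are not missing anything are left out of the result.
    """
    # Union of every path reported by any server (hypothetical input data).
    all_paths: Set[str] = set().union(*views.values())
    result: Dict[str, Set[str]] = {}
    for server, paths in views.items():
        missing = all_paths - paths
        if missing:
            result[server] = missing
    return result


if __name__ == "__main__":
    # Made-up views: one follower lacks two nodes the others have.
    views = {
        "leader": {"/app/e1", "/app/e2", "/app/e3"},
        "follower-a": {"/app/e1", "/app/e2", "/app/e3"},
        "follower-b": {"/app/e1"},
    }
    for server, missing in find_missing_nodes(views).items():
        print(server, sorted(missing))
```

What to do once a difference is found (restart the client, or halt and report)
is exactly the open decision discussed here, so the sketch only reports it.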
I am linking this to ZOOKEEPER-832, since they both deal with cases that
only happen if the quorum is broken, and they both require a decision on what
(if anything) to do in those cases.
> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
> Reporter: Germán Blanco
> Assignee: Germán Blanco
> Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777-3.4.patch,
> ZOOKEEPER-1777.patch, ZOOKEEPER-1777.patch, ZOOKEEPER-1777.tar.gz
>
>
> In a 3-server ensemble, one of the followers doesn't see part of the
> ephemeral nodes that are present in the leader and the other follower.
> The 8 missing nodes in "the follower that is not ok" were created at the end
> of epoch 1; the ensemble is now running in epoch 2.