[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786083#comment-13786083
 ] 

Flavio Junqueira commented on ZOOKEEPER-1777:
---------------------------------------------

Ok, thanks for the clarification. Here are my current thoughts then. In the A, 
B, C scenario above, I would have expected that A truncates its history and 
adopts the state of B, C, losing some transactions. But, according to your 
observations, A is not getting some of the new transactions of B, C, which does 
not correspond to my expectation, so that's the only thing I believe we would 
need to fix, if anything.
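
As a toy illustration of the expected behavior described above (this is not 
ZooKeeper code; the (epoch, counter) pairs, the function name and the sample 
logs are all assumptions made for the sketch), a follower that adopts the 
leader's history keeps the common prefix, drops whatever it accepted beyond 
it, and then takes the leader's remaining transactions:

```python
from typing import List, Tuple

# (epoch, counter): a simplified, hypothetical stand-in for a ZooKeeper zxid.
Zxid = Tuple[int, int]


def sync_follower(follower_log: List[Zxid], leader_log: List[Zxid]) -> List[Zxid]:
    """Return the follower's log after it adopts the leader's history."""
    # Length of the longest common prefix of the two histories.
    common = 0
    for f_txn, l_txn in zip(follower_log, leader_log):
        if f_txn != l_txn:
            break
        common += 1
    # Truncate past the common prefix, then append the leader's suffix.
    return follower_log[:common] + leader_log[common:]


# Hypothetical logs: A accepted (1, 3) and (1, 4) on its own at the end of
# epoch 1, while B and C moved on to epoch 2.
a_log = [(1, 1), (1, 2), (1, 3), (1, 4)]
bc_log = [(1, 1), (1, 2), (2, 1), (2, 2)]

print(sync_follower(a_log, bc_log))
```

In this model A loses (1, 3) and (1, 4) but ends up with every transaction 
of B and C, which is the outcome the comment expects; the reported behavior 
differs in that A is missing some of the B, C transactions.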

The thing that throws me is that in any case your data is broken! I'm not 
exactly sure why it matters to have a consistent ensemble with a broken state. 
One point you made is that we could try to detect the problem so that A decides 
not to make further progress. Unfortunately, A can't determine whether the 
transactions that it has and that B, C don't have were committed or not. The 
txn log only 
contains accepted txns. A cannot distinguish this run from one in which A is 
the only one that has accepted these txns.
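
A minimal sketch of this argument, assuming a 3-server ensemble and a 
made-up transaction id (the names below are hypothetical and do not come 
from the ZooKeeper code base): a local log that records only what a server 
accepted looks the same to A whether or not a quorum accepted the 
transaction.

```python
from typing import Dict, List, Set


def is_committed(acks: Set[str], ensemble_size: int) -> bool:
    """A txn counts as committed here once a strict majority accepted it."""
    return len(acks) > ensemble_size // 2


def local_log(server: str, acks_by_txn: Dict[str, Set[str]]) -> List[str]:
    """What a server sees in its own txn log: only the txns it accepted."""
    return [txn for txn, acks in sorted(acks_by_txn.items()) if server in acks]


# Two hypothetical runs of a 3-server ensemble {A, B, C}.
run_quorum = {"txn-7": {"A", "B"}}  # accepted by a majority
run_alone = {"txn-7": {"A"}}        # accepted by A only

# A's local view is identical in both runs...
print(local_log("A", run_quorum) == local_log("A", run_alone))
# ...although the commit status of the txn differs between them.
print(is_committed(run_quorum["txn-7"], 3), is_committed(run_alone["txn-7"], 3))
```

If B later loses the transaction, A has nothing in its own log that tells 
the first run apart from the second.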

Given that we are discussing an invalid scenario, I'd like to propose that we 
lower the severity of this issue to major or critical. This means that it is 
not a blocker for 3.4.6, but we could still try to get this in if we can 
converge on a solution.

> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1777
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.5
>         Environment: Linux, Java 1.7
>            Reporter: Germán Blanco
>            Assignee: Germán Blanco
>            Priority: Blocker
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: logs_trunk.tar.gz, snaps.tar, ZOOKEEPER-1777.tar.gz
>
>
> In a 3-servers ensemble, one of the followers doesn't see part of the 
> ephemeral nodes that are present in the leader and the other follower. 
> The 8 missing nodes in "the follower that is not ok" were created in the end 
> of epoch 1, the ensemble is running in epoch 2.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
