[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139879#comment-13139879
 ] 

Camille Fournier commented on ZOOKEEPER-1264:
---------------------------------------------

OK, I found the bug. Ben, we could use your attention here.

The problem is that we queue NEWLEADER before we queue UPTODATE, but in between 
these messages we send more sync packets to move the follower from SNAP to 
UPTODATE. These get written directly to the data tree, bypassing the log. If 
you shut down the ZK immediately afterwards, before it snapshots again, that 
server loses any record of these transactions. It seems to me that we 
should either snapshot again on UPTODATE or else wait to snapshot in the first 
place until that packet is sent. I don't understand why we moved to snapshot on 
NEWLEADER in the first place. If one of the ZAB 1.0 authors could comment, that 
would be useful.
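
To make the failure mode concrete, here is a rough toy model of the ordering 
described above. It is only a sketch: ResyncLossSketch, ToyFollower and their 
methods are hypothetical names invented for illustration and are not the actual 
ZooKeeper Learner/Follower code. It assumes recovery rebuilds state from the 
last snapshot plus the transaction log, and that sync packets sent between 
NEWLEADER and UPTODATE reach neither.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical classes, not ZooKeeper's real implementation.
public class ResyncLossSketch {

    /** Minimal stand-in for a follower's in-memory and durable state. */
    static class ToyFollower {
        final List<Long> dataTree = new ArrayList<>(); // in-memory state (applied zxids)
        final List<Long> txnLog = new ArrayList<>();   // durable transaction log
        List<Long> snapshot = new ArrayList<>();       // last durable snapshot

        // SNAP: replace the in-memory tree with the leader's snapshot.
        void loadSnap(List<Long> leaderSnap) {
            dataTree.clear();
            dataTree.addAll(leaderSnap);
        }

        // Persist the current in-memory tree.
        void takeSnapshot() {
            snapshot = new ArrayList<>(dataTree);
        }

        // Sync-phase packet: applied straight to the tree, never logged.
        void applySyncTxn(long zxid) {
            dataTree.add(zxid);
        }

        // Restart: rebuild state from the snapshot plus whatever is in the log.
        List<Long> recover() {
            List<Long> restored = new ArrayList<>(snapshot);
            for (long zxid : txnLog) {
                if (!restored.contains(zxid)) {
                    restored.add(zxid);
                }
            }
            return restored;
        }
    }

    /** Runs one resync, then shuts down immediately and recovers from disk. */
    static List<Long> run(boolean snapshotOnUpToDate) {
        ToyFollower f = new ToyFollower();
        f.loadSnap(Arrays.asList(1L, 2L)); // SNAP from the leader
        f.takeSnapshot();                  // NEWLEADER: snapshot taken here
        f.applySyncTxn(3L);                // sync packets sent before UPTODATE
        f.applySyncTxn(4L);
        if (snapshotOnUpToDate) {
            f.takeSnapshot();              // proposed fix: snapshot again on UPTODATE
        }
        return f.recover();                // shutdown right after UPTODATE
    }

    public static void main(String[] args) {
        System.out.println("snapshot on NEWLEADER only: " + run(false)); // [1, 2]
        System.out.println("snapshot again on UPTODATE: " + run(true));  // [1, 2, 3, 4]
    }
}
```

In this model the first run comes back without zxids 3 and 4, which is the same 
kind of divergence the test reports as a mismatch in ephemeral counts. 
Deferring the single snapshot until UPTODATE would give the same result as the 
second run.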
                
> FollowerResyncConcurrencyTest failing intermittently
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-1264
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1264
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: tests
>    Affects Versions: 3.3.3, 3.4.0, 3.5.0
>            Reporter: Patrick Hunt
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1264.patch, ZOOKEEPER-1264_branch33.patch, 
> ZOOKEEPER-1264_branch34.patch, followerresyncfailure_log.txt.gz, logs.zip, 
> tmp.zip
>
>
> The FollowerResyncConcurrencyTest test is failing intermittently. 
> Saw the following on 3.4:
> {noformat}
> junit.framework.AssertionFailedError: Should have same number of ephemerals in both followers expected:<11741> but was:<14001>
>        at org.apache.zookeeper.test.FollowerResyncConcurrencyTest.verifyState(FollowerResyncConcurrencyTest.java:400)
>        at org.apache.zookeeper.test.FollowerResyncConcurrencyTest.testResyncBySnapThenDiffAfterFollowerCrashes(FollowerResyncConcurrencyTest.java:196)
>        at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {noformat}


        
