[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825014#comment-13825014
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1573:
---------------------------------------------------

Last nit (though feel free to ignore it since it refers to improving old code 
as well):

{noformat}
+
+        long start = System.currentTimeMillis();
+        while (!connected) {
+            long end = System.currentTimeMillis();
+            if (end - start > 5000) {
+                Assert.assertTrue("Could not connect with server in 5 seconds",
+                        false);
+            }
+            try {
+                Thread.sleep(200);
+            } catch (Exception e) {
+                LOG.warn("Interrupted");
+            }
+        }
{noformat}

This is copy/pasted in two other tests as well. Can we move it to a method 
called waitConnected and call that instead? It'll make the tests shorter and 
more readable, I think. 
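
Something along these lines might work. This is only a rough sketch and not part of the patch: the class name ConnectionWaiter, the BooleanSupplier parameter, and the timeout/poll arguments are my assumptions. It throws AssertionError directly so the snippet compiles without JUnit; in the real tests Assert.fail would be the natural choice.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; the class name and parameters are illustrative only.
final class ConnectionWaiter {
    private ConnectionWaiter() {}

    /**
     * Polls isConnected every pollMs milliseconds until it returns true.
     * Throws AssertionError once more than timeoutMs has elapsed, or if the
     * waiting thread is interrupted.
     */
    static void waitConnected(BooleanSupplier isConnected, long timeoutMs, long pollMs) {
        long start = System.currentTimeMillis();
        while (!isConnected.getAsBoolean()) {
            if (System.currentTimeMillis() - start > timeoutMs) {
                throw new AssertionError(
                        "Could not connect with server in " + timeoutMs + " ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of swallowing it.
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting for connection", e);
            }
        }
    }
}
```

Each test could then call ConnectionWaiter.waitConnected(() -> connected, 5000, 200), assuming connected is a field that the watcher thread updates (it probably needs to be volatile for the polling thread to see the change).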


> Unable to load database due to missing parent node
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-1573
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1573
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.5.0
>            Reporter: Thawan Kooburat
>         Attachments: ZOOKEEPER-1573.patch, ZOOKEEPER-1573.patch, 
> ZOOKEEPER-1573.patch
>
>
> While replaying the txnlog on the data tree, the server has code to detect a 
> missing parent node. This code block was last modified as part of 
> ZOOKEEPER-1333. In production, we found a case where this check returns a 
> false positive.
> The sequence of txns is as follows:
> zxid 1:  create /prefix/a
> zxid 2:  create /prefix/a/b
> zxid 3:  delete /prefix/a/b
> zxid 4:  delete /prefix/a
> The server starts capturing a snapshot at zxid 1. However, by the time it 
> traverses the data tree down to /prefix, txn 4 has already been applied and 
> /prefix has no children. 
> When the server restores from the snapshot, it processes the txnlog starting 
> from zxid 2. This txn generates a missing parent error and the server refuses 
> to start up.
> The same check allowed me to discover the bug in ZOOKEEPER-1551, but I don't 
> know if we have any option besides removing this check to solve this issue.  
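
For anyone following along, the sequence in the description can be reproduced with a toy model. This is not ZooKeeper code: the data tree is just a set of paths, and every name here (FuzzySnapshotReplay, Txn, apply, replay, demo) is made up for illustration. It assumes the snapshot is labelled with zxid 1 but is only serialized after txn 4 has been applied, as the report describes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only; none of these names exist in ZooKeeper.
final class FuzzySnapshotReplay {
    static final class Txn {
        final long zxid;
        final boolean create; // true = create, false = delete
        final String path;

        Txn(long zxid, boolean create, String path) {
            this.zxid = zxid;
            this.create = create;
            this.path = path;
        }
    }

    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    // Applies one txn; returns false only for a create whose parent is absent.
    static boolean apply(Set<String> tree, Txn txn) {
        if (!txn.create) {
            tree.remove(txn.path); // deleting an absent node is ignored here
            return true;
        }
        if (!tree.contains(parentOf(txn.path))) {
            return false;
        }
        tree.add(txn.path);
        return true;
    }

    // Replays txns newer than snapshotZxid; returns the zxid of the first
    // create that hits a missing parent, or -1 if replay succeeds.
    static long replay(Set<String> snapshot, List<Txn> log, long snapshotZxid) {
        Set<String> tree = new TreeSet<>(snapshot);
        for (Txn txn : log) {
            if (txn.zxid > snapshotZxid && !apply(tree, txn)) {
                return txn.zxid;
            }
        }
        return -1;
    }

    static long demo() {
        List<Txn> log = Arrays.asList(
                new Txn(1, true, "/prefix/a"),
                new Txn(2, true, "/prefix/a/b"),
                new Txn(3, false, "/prefix/a/b"),
                new Txn(4, false, "/prefix/a"));
        Set<String> live = new TreeSet<>(Arrays.asList("/", "/prefix"));
        // The snapshot starts at zxid 1, but /prefix is serialized only
        // after txn 4 has been applied, so it is captured with no children.
        for (Txn txn : log) {
            apply(live, txn);
        }
        Set<String> snapshot = new TreeSet<>(live);
        return replay(snapshot, log, 1);
    }
}
```

In this model demo() returns 2, the zxid of the create for /prefix/a/b, because /prefix/a is absent from the snapshot even though the full log is consistent.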



--
This message was sent by Atlassian JIRA
(v6.1#6144)
