[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824644#comment-13824644
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1573:
---------------------------------------------------

Nit - maybe this:

{noformat}
+         * Snapshots are lazily created. So when the snapshot was in progress
+         * there is a chance that some of the later transactions can go into
+         * snapshot. While restoring same transactions NONODE/NODEEXISTS errors
+         * can come. Basically we can ignore all errors during the restore.
{noformat}

could be clearer like this:

{noformat}
+         * Snapshots are lazily created. So when a snapshot is in progress,
+         * there is a chance for later transactions to make it into the snapshot.
+         * Then when the snapshot is restored, NONODE/NODEEXISTS errors
+         * could occur. It should be safe to ignore these.
{noformat}

Nit:

{noformat}
+                LOG.warn("Intrrupted");
{noformat}

Typo: "Intrrupted" should be "Interrupted".

Nit:
{noformat}
+            LOG.debug("Ignoring processTxn failure hdr: " + hdr.getType() + " : error: " + rc.err + " path: " + rc.path);
{noformat}

Use parameterized logging with {} placeholders instead of string concatenation.
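
A minimal sketch of what the {} form looks like for the debug line quoted above. ZooKeeper logs through SLF4J, whose Logger.debug(String, Object...) accepts {} placeholders. To keep the sketch runnable without the SLF4J jar, MiniLogger below is a hypothetical stand-in that only mimics the placeholder substitution, and the argument values are illustrative rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class ParameterizedLoggingSketch {

    /** Hypothetical stand-in for an SLF4J-style logger; not SLF4J or ZooKeeper code. */
    static final class MiniLogger {
        private final boolean debugEnabled;
        final List<String> lines = new ArrayList<>();

        MiniLogger(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        /** Builds the message only when debug is enabled, so disabled calls skip the formatting work. */
        void debug(String format, Object... args) {
            if (!debugEnabled) {
                return;
            }
            lines.add(format(format, args));
        }

        /** Replaces each "{}" with the next argument, left to right. */
        static String format(String format, Object... args) {
            StringBuilder out = new StringBuilder();
            int from = 0;
            for (Object arg : args) {
                int at = format.indexOf("{}", from);
                if (at < 0) {
                    break; // more arguments than placeholders: ignore the rest
                }
                out.append(format, from, at).append(arg);
                from = at + 2;
            }
            return out.append(format.substring(from)).toString();
        }
    }

    public static void main(String[] args) {
        MiniLogger log = new MiniLogger(true);
        // Illustrative values standing in for hdr.getType(), rc.err and rc.path.
        int type = 1;
        int err = -101;
        String path = "/prefix/a/b";
        log.debug("Ignoring processTxn failure hdr: {} : error: {} path: {}", type, err, path);
        System.out.println(log.lines.get(0));
    }
}
```

With the real SLF4J Logger the call site has the same shape: the format string comes first, followed by hdr.getType(), rc.err and rc.path as arguments.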

Nit:
{noformat}
+    /**
+     * Test we can restore a snapshot that has delete txns ahead of the zxid of the snapshot file. ZOOKEEPER-1573
+     */
{noformat}

make it:

{noformat}
+    /**
+     * ZOOKEEPER-1573: test restoring a snapshot with delete txns ahead of the snapshot file's zxid.
+     */
{noformat}

Nit:
{noformat}
+        LOG.info("Set lastProcessedZxid to " + zks.getZKDatabase().getDataTreeLastProcessedZxid());
{noformat}

Ditto regarding parameterized logging via {}.



> Unable to load database due to missing parent node
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-1573
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1573
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.5.0
>            Reporter: Thawan Kooburat
>         Attachments: ZOOKEEPER-1573.patch
>
>
> While replaying the txnlog on the data tree, the server has code to detect a
> missing parent node. This code block was last modified as part of
> ZOOKEEPER-1333. In our production environment, we found a case where this
> check returns a false positive.
> The sequence of txns is as follows:
> zxid 1:  create /prefix/a
> zxid 2:  create /prefix/a/b
> zxid 3:  delete /prefix/a/b
> zxid 4:  delete /prefix/a
> The server starts capturing a snapshot at zxid 1. However, by the time it
> traverses the data tree down to /prefix, txn 4 has already been applied and
> /prefix has no children.
> When the server restores from the snapshot, it processes the txnlog starting
> from zxid 2. This txn generates a missing parent error and the server refuses
> to start up.
> The same check allowed me to discover the bug in ZOOKEEPER-1551, but I don't
> know if we have any option besides removing this check to solve this issue.
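
The sequence in the description can be reproduced with a toy model. This is a sketch under stated assumptions, not ZooKeeper's DataTree or the attached patch: ToyTree is a hypothetical set of paths, and replay() is a hypothetical helper that records NONODE/NODEEXISTS results instead of aborting, which is the behaviour the review discusses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class FuzzySnapshotReplaySketch {

    enum Op { CREATE, DELETE }
    enum Result { OK, NONODE, NODEEXISTS }

    static final class Txn {
        final long zxid;
        final Op op;
        final String path;

        Txn(long zxid, Op op, String path) {
            this.zxid = zxid;
            this.op = op;
            this.path = path;
        }
    }

    /** Toy data tree: just a set of paths (hypothetical; no data, ACLs or NOTEMPTY checks). */
    static final class ToyTree {
        final TreeSet<String> nodes = new TreeSet<>();

        Result apply(Txn txn) {
            String parent = txn.path.substring(0, txn.path.lastIndexOf('/'));
            if (txn.op == Op.CREATE) {
                // An empty parent means the root, which always exists in this model.
                if (!parent.isEmpty() && !nodes.contains(parent)) {
                    return Result.NONODE;
                }
                return nodes.add(txn.path) ? Result.OK : Result.NODEEXISTS;
            }
            return nodes.remove(txn.path) ? Result.OK : Result.NONODE;
        }
    }

    /** The four txns from the issue description. */
    static List<Txn> sampleLog() {
        return Arrays.asList(
                new Txn(1, Op.CREATE, "/prefix/a"),
                new Txn(2, Op.CREATE, "/prefix/a/b"),
                new Txn(3, Op.DELETE, "/prefix/a/b"),
                new Txn(4, Op.DELETE, "/prefix/a"));
    }

    /** Replays txns with zxid greater than snapshotZxid, collecting errors instead of failing. */
    static List<String> replay(ToyTree tree, List<Txn> log, long snapshotZxid) {
        List<String> ignored = new ArrayList<>();
        for (Txn txn : log) {
            if (txn.zxid <= snapshotZxid) {
                continue;
            }
            Result rc = tree.apply(txn);
            if (rc != Result.OK) {
                ignored.add("zxid " + txn.zxid + ": " + rc);
            }
        }
        return ignored;
    }

    public static void main(String[] args) {
        // Fuzzy snapshot labelled zxid 1: /prefix was serialized after txn 4 had
        // already been applied, so /prefix/a is missing from the snapshot.
        ToyTree tree = new ToyTree();
        tree.nodes.add("/prefix");

        List<String> ignored = replay(tree, sampleLog(), 1);
        System.out.println("ignored: " + ignored);
        System.out.println("final tree: " + tree.nodes);
    }
}
```

In this model txn 2 fails with NONODE because /prefix/a is absent, yet the tree after replay contains only /prefix, which matches the state a live server would hold after txn 4. That is why treating the error as fatal is a false positive in this sequence.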



--
This message was sent by Atlassian JIRA
(v6.1#6144)
