[ 
https://issues.apache.org/jira/browse/OAK-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090676#comment-14090676
 ] 

Chetan Mehrotra commented on OAK-2020:
--------------------------------------

*Case 1 - Spurious Commits  causing Conflicts*

Following testcase demonstrates spurious commits.

{code:java}
public void diffWithConflict() throws Exception{
        //Last rev on /var would be 1-0-1
        createNodes("/var/a", "/var/b/b1");

        //1. Dummy commits to bump the version no
        createNodes("/fake/b");
        createNodes("/fake/c");

        //Root rev = 3-0-1
        //Root rev = 3-0-1

        //2. Create a node under /var/a but hold on commit
        NodeBuilder b1 = ns.getRoot().builder();
        createNodes(b1, "/var/a/a1");

        //3. Remove a node under /var/b and commit it
        NodeBuilder b2 = ns.getRoot().builder();
        b2.child("var").child("b").child("b1").remove();
        merge(b2);

        //4. Now merge and commit the changes in b1 and include conflict hooks
        //For now exception would be thrown
        ns.merge(b1,
                new CompositeHook(
                        new ConflictHook(new AnnotatingConflictHandler()),
                        new EditorHook(new ConflictValidatorProvider())
                ),
                CommitInfo.EMPTY);
    }
{code}

Testcase performs following steps 

# DocumentNodeStore is created with cache disabled and simple rev counter
# Base node structure is create at Rev 1-0-1. Here lastRev of /var is 1-0-1
#* /var/a
#* /var/b/b1
# Some dummy commits are made to increase the rev to 3-0-1
# Session 1  (B1) - Added a new node /var/a/a1. Change yet not merged
# Session 2 (B2) - Removes node at /var/b/b1 and commits it. This sets the 
lastRev of /var to 4-0-1
# Session 1 tries to commit but fails with conflict
bq. Commit failed due to unresolved conflicts in /var/b = {changeDeletedNode = 
{b1}}
{noformat}
org.apache.jackrabbit.oak.api.CommitFailedException: OakState0001: Unresolved 
conflicts in /var/b
        at 
org.apache.jackrabbit.oak.plugins.commit.ConflictValidator.failOnMergeConflict(ConflictValidator.java:84)
        at 
org.apache.jackrabbit.oak.plugins.commit.ConflictValidator.propertyAdded(ConflictValidator.java:54)
        at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.propertyAdded(EditorDiff.java:82)
        at 
org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:374)
        at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
        at 
org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:395)
        at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
        at 
org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:395)
        at 
org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
        at 
org.apache.jackrabbit.oak.spi.commit.EditorHook.processCommit(EditorHook.java:54)
        at 
org.apache.jackrabbit.oak.spi.commit.CompositeHook.processCommit(CompositeHook.java:60)
        at 
org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$InMemory.merge(AbstractNodeStoreBranch.java:509)
        at 
org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.merge(AbstractNodeStoreBranch.java:310)
        at 
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.merge(DocumentNodeStoreBranch.java:142)
        at 
org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.merge(DocumentRootBuilder.java:159)
        at 
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.merge(DocumentNodeStore.java:1291)
        at 
org.apache.jackrabbit.oak.plugins.document.NodeStoreDiffTest.diffWithConflict(NodeStoreDiffTest.java:85)
{noformat}

Ideally this should not have failed as b1 was not removed in B1. Same testcase 
passes when cache is enabled. Now following flow happens when merge is 
performed on B1 then first a rebase is performed in 
AbstractNodeStoreBranch.InMemory#rebase.

{code:java}
@Override
void rebase() {
    N root = getRoot();
    NodeBuilder builder = root.builder();
    head.compareAgainstBaseState(base, new 
ConflictAnnotatingRebaseDiff(builder));
    head = builder.getNodeState();
    base = root;
}
{code}


In that rebase ModifiedNodeState.compareAgainstBaseState is invoked which then 
compares the base states.

{code:java}
public boolean compareAgainstBaseState(
        NodeState base, final NodeStateDiff diff) {
    if (this == base) {
        return true; // no differences
    }

    ...
    return this.base.compareAgainstBaseState(base, new NodeStateDiff() {
    ...
}
{code}


In ideal case {{this.base}} and {{base}} should be same for /var. But thats not 
the case
* this.base - /var, lastRev 1-0-1
* base - /var lastRev 3-0-1

As cache is disabled when DocumentNodeState fetches the /var with readRevision 
set to 3-0-1 (lastRev of root node when B1 was started) then calls is made to 
NodeDocument#getNodeAtRevision. Here the logic decides that at given 
readRevision (3-0-1) how does /var looks like. In doing so it has to set the 
lastRev of returned node. When {{base}}was created then lastRev of /var was 
1-0-1 but the current value of lastRev at /var is 4-0-1. Then in NodeDocument 
logic the lastRev is determined which as per below flow is set to readRev i.e. 
3-0-1. Due to this later when /var/b/b1 is read then there version are found to 
be different also (they are set to readRev) which triggers changeDeletedNode 
clause

{code:java}
 for (Revision r : lastRevs.values()) {
            // ignore if newer than readRevision
            if (isRevisionNewer(nodeStore, r, readRevision)) {
                // the node has a _lastRev which is newer than readRevision
                // this means we don't know when this node was
                // modified by an operation on a descendant node between
                // current lastRevision and readRevision. therefore we have
                // to stay on the safe side and use readRevision
                lastRevision = readRevision;
                continue;
            } 
{code}

*Problem*
As explained above due to way DocumentNodeStore work a node like /var read at 
same rev is fetched with different value which causes diff logic to fail. When 
cache is present and previously read DocumentNodeState is cached then same 
value is returned and hence diff logic works fine. The problem comes because we 
cannot accurately determine the NodeState at previous revision from persisted 
state i.e. as show above though the lastRev of /var should have been 1-0-1 but 
that data is lost and it changes to 3-0-1

*Solution*
Need to work out. For now to reduce the problem from occurring we can increase 
the cache size for nodeCache such that correct data is present in cache. 
Howeever this would not work properly if such reading from old revision is 
performed on a different cluster node say with AsyncIndexUpdate and read there 
at previous version might miss out on some changes. Still need to confirm that 
part.


> NodeState view at given version is not stable with DocumentNodeStore
> --------------------------------------------------------------------
>
>                 Key: OAK-2020
>                 URL: https://issues.apache.org/jira/browse/OAK-2020
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: mongomk
>            Reporter: Chetan Mehrotra
>             Fix For: 1.1
>
>
> DocumentNodeStore should provide same NodeState for given (path, revision)  
> whenever a NodeState is obtained from it. This fails in some cases if the 
> Node cache in DocumentNodeStore gets overflown and then the NodeState 
> returned differs. This causes issues like
> # Spurious Commits see - Some commit would fails with conflicts in paths 
> which were not modified in that commit
> # Diff logic would traverse those paths which are not related to changes done 
> in that commit
> More details would be provided in comments below



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to