Thomas Mueller created OAK-853:
----------------------------------

             Summary: Many child nodes: Diffing causes many calls to 
MicroKernel.getNodes
                 Key: OAK-853
                 URL: https://issues.apache.org/jira/browse/OAK-853
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: core
            Reporter: Thomas Mueller


Creating a flat hierarchy of the following form causes many calls to 
MicroKernel.getNodes and is thus slow.

{code}
for (int i = 0; i < 10000; i++) {
    root.addNode("test" + i, "nt:folder");
    if (i % 1000 == 0) {
        session.save();
    }
}
{code}

As far as I can see, this isn't just the case for MicroKernel-based storage,
but also for the SegmentNodeStore. The reason seems to be that the optimization
for many child nodes in KernelNodeState.compareAgainstBaseState and
SegmentNodeState.compareAgainstBaseState, which is meant to avoid iterating
over all children, doesn't take effect.

The optimization uses:

{code}
if (base instanceof SegmentNodeState) ...
if (base instanceof KernelNodeState) ...
{code}

Ideally, the instanceof should be avoided, but I'm not sure how to do that yet.
Anyway, the problem is that "base" is a ModifiedNodeState, so neither check
matches and the optimization is not used.
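
To illustrate the fall-through, here is a minimal, self-contained sketch. It
does not use the Oak classes: State, StoredState, ModifiedState and comparePath
are hypothetical stand-ins, and the "optimized" branch only reports which path
would be taken rather than performing a real diff.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class InstanceofFallback {
    /** Hypothetical, heavily simplified stand-in for a node state. */
    interface State {
        Set<String> childNames();
    }

    /** Stand-in for a persisted state (KernelNodeState or SegmentNodeState). */
    static final class StoredState implements State {
        private final Set<String> names;

        StoredState(String... names) {
            this.names = new LinkedHashSet<>(Arrays.asList(names));
        }

        public Set<String> childNames() {
            return names;
        }

        /** Reports which comparison path an instanceof-guarded check selects. */
        String comparePath(State base) {
            if (base instanceof StoredState) {
                return "optimized";      // a storage-level diff would be possible
            }
            return "full iteration";     // every child has to be visited
        }
    }

    /** Stand-in for an in-memory layer of changes on top of a base state. */
    static final class ModifiedState implements State {
        private final State base;
        private final Set<String> added;

        ModifiedState(State base, String... added) {
            this.base = base;
            this.added = new LinkedHashSet<>(Arrays.asList(added));
        }

        public Set<String> childNames() {
            Set<String> all = new LinkedHashSet<>(base.childNames());
            all.addAll(added);
            return all;
        }
    }

    public static void main(String[] args) {
        StoredState before = new StoredState("test0", "test1");
        StoredState after = new StoredState("test0", "test1", "test2");
        State wrapped = new ModifiedState(before, "test2");

        System.out.println(after.comparePath(before));   // optimized
        System.out.println(after.comparePath(wrapped));  // full iteration
    }
}
```

The wrapped state has the same children as a stored state would, but its
runtime type hides that from the instanceof check.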

I was thinking, couldn't the ModifiedNodeState do a reverse diff in this case?
That is, inside ModifiedNodeState.compareAgainstBaseState, check whether the
"base" parameter is a ModifiedNodeState while the "base" field is not; if so,
do a reverse diff, which would be efficient. (We should probably not use "base"
for both the field name and the parameter, but that's a change for another
time.)
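
One possible reading of "reverse diff", sketched under loud assumptions: when
the state to compare against is a modification layer sitting directly on top of
the other state, walk only that layer's recorded changes and swap added and
deleted events. None of the types below are Oak classes. Diff, State,
StoredState, ModifiedState, inverse and compare are hypothetical, only child
additions are tracked, and the identity check on the layer's base is a
simplification of the condition described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReverseDiffSketch {
    /** Hypothetical diff callback, reduced to child additions and deletions. */
    interface Diff {
        void childAdded(String name);
        void childDeleted(String name);
    }

    interface State {
        Set<String> childNames();
    }

    /** Stand-in for a persisted state with potentially many children. */
    static final class StoredState implements State {
        private final Set<String> names;
        StoredState(String... names) {
            this.names = new LinkedHashSet<>(Arrays.asList(names));
        }
        public Set<String> childNames() { return names; }
    }

    /** In-memory layer: a base state plus names added on top (assumed new). */
    static final class ModifiedState implements State {
        final State base;
        final Set<String> added;
        ModifiedState(State base, String... added) {
            this.base = base;
            this.added = new LinkedHashSet<>(Arrays.asList(added));
        }
        public Set<String> childNames() {
            Set<String> all = new LinkedHashSet<>(base.childNames());
            all.addAll(added);
            return all;
        }
        /** Cheap: touches only this layer's own changes, not the base children. */
        void compareAgainstOwnBase(Diff diff) {
            for (String name : added) {
                diff.childAdded(name);
            }
        }
    }

    /** Wraps a diff so that additions are reported as deletions and vice versa. */
    static Diff inverse(final Diff diff) {
        return new Diff() {
            public void childAdded(String name) { diff.childDeleted(name); }
            public void childDeleted(String name) { diff.childAdded(name); }
        };
    }

    /** Reports how 'after' differs from 'before'. */
    static void compare(State after, State before, Diff diff) {
        if (before instanceof ModifiedState && ((ModifiedState) before).base == after) {
            // Reverse diff: 'before' is 'after' plus a few changes, so replay
            // those changes with inverted events instead of visiting all children.
            ((ModifiedState) before).compareAgainstOwnBase(inverse(diff));
            return;
        }
        // Fallback: full iteration over both child sets.
        Set<String> afterNames = after.childNames();
        Set<String> beforeNames = before.childNames();
        for (String n : afterNames) {
            if (!beforeNames.contains(n)) diff.childAdded(n);
        }
        for (String n : beforeNames) {
            if (!afterNames.contains(n)) diff.childDeleted(n);
        }
    }

    /** Collects events as strings such as "+name" and "-name". */
    static List<String> diffEvents(State after, State before) {
        final List<String> events = new ArrayList<>();
        compare(after, before, new Diff() {
            public void childAdded(String name) { events.add("+" + name); }
            public void childDeleted(String name) { events.add("-" + name); }
        });
        return events;
    }

    public static void main(String[] args) {
        StoredState stored = new StoredState("a", "b");
        State layered = new ModifiedState(stored, "c");
        System.out.println(diffEvents(stored, layered)); // [-c]
    }
}
```

In this sketch the reverse path costs time proportional to the number of
changes in the layer, while the fallback is proportional to the number of
children. Whether the same shortcut is safe in Oak depends on details that this
simplified model leaves out, such as removed or changed children and
properties.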

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
