[ https://issues.apache.org/jira/browse/OAK-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674124#comment-13674124 ]
Jukka Zitting commented on OAK-853:
-----------------------------------

bq. I found that the original problematic call is ...

Ah, good, that makes more sense.

The basic idea behind {{compareAgainstBaseState()}} is that even though it allows comparisons of arbitrary {{NodeStates}}, the expectation (as encoded in the method name) is that comparing a state against a previous version of itself (an earlier revision of a {{KernelNodeState}}, or a {{ModifiedNodeState}} against its base state, etc.) can be done efficiently (i.e. in many cases the {{this.base.compareAgainstBaseState(x, Diff)}} call would be {{x.compareAgainstBaseState(x, Diff)}}, which would reduce to just a no-op).

I guess in some cases that last optimization wouldn't apply. In that case a better approach than reversing the diff (which kind of invalidates the "compare against previous version" idea) might be to check if {{x instanceof ModifiedNodeState}} and do the {{ModifiedNodeState}}-to-{{ModifiedNodeState}} comparison first before continuing with {{this.base.compareAgainstBaseState(x.base, Diff)}}.

> Many child nodes: Diffing causes many calls to MicroKernel.getNodes
> -------------------------------------------------------------------
>
>                 Key: OAK-853
>                 URL: https://issues.apache.org/jira/browse/OAK-853
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>            Reporter: Thomas Mueller
>         Attachments: OAK-853.patch
>
>
> Creating a flat hierarchy of the following form causes many calls to
> MicroKernel.getNodes and is thus slow.
> {code}
> for (int i = 0; i < 10000; i++) {
>     root.addNode("test" + i, "nt:folder");
>     if (i % 1000 == 0) {
>         session.save();
>     }
> }
> {code}
> As far as I see, this isn't just the case for MicroKernel based storage, but
> also for the SegmentNodeStore. The reason seems to be that the optimization
> for many child nodes in KernelNodeState.compareAgainstBaseState and
> SegmentNodeState.compareAgainstBaseState that avoids iterating over all
> children doesn't work.
> The optimization uses:
> {code}
> if (base instanceof SegmentNodeState) ...
> if (base instanceof KernelNodeState) ...
> {code}
> Ideally, the instanceof should be avoided, but I'm not sure how to do that
> yet. Anyway, the problem is that "base" is a ModifiedNodeState so no
> optimization can be used.
> I was thinking, couldn't the ModifiedNodeState do a reverse diff in this
> case? That is, inside ModifiedNodeState.compareAgainstBaseState, check if the
> "base" parameter is a ModifiedNodeState, and the "base" field is not, then do
> a reverse diff, which would be efficient. (We should probably not use "base"
> for both the field name and the parameter; well that's a change for another
> time.)
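
To make the suggestion in the comment concrete (compare the two modified states first, then continue with base against base), here is a rough toy sketch. It is not the Oak code: {{OverlayDiffSketch}}, {{State}}, {{Stored}}, {{Modified}}, {{Diff.changed()}} and the {{fullScans}} counter are all hypothetical names made up for illustration, and children are modelled as plain strings rather than node states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Toy model only: these types are hypothetical and are NOT the Oak classes.
final class OverlayDiffSketch {

    /** Receives one callback per child whose value differs. */
    interface Diff {
        void changed(String name, String before, String after);
    }

    interface State {
        String child(String name);            // null means "no such child"
        Set<String> childNames();
        void compareAgainstBaseState(State before, Diff diff);
    }

    /** Stands in for a persisted state (KernelNodeState / SegmentNodeState in the thread). */
    static final class Stored implements State {
        final Map<String, String> children;
        int fullScans = 0;                     // counts expensive comparisons, for the demo

        Stored(Map<String, String> children) { this.children = new HashMap<>(children); }

        public String child(String name) { return children.get(name); }
        public Set<String> childNames() { return children.keySet(); }

        public void compareAgainstBaseState(State before, Diff diff) {
            if (before == this) {
                return;                        // same state: nothing to report
            }
            fullScans++;
            fullCompare(this, before, diff);
        }
    }

    /** Stands in for ModifiedNodeState: a base plus an overlay of changed children. */
    static final class Modified implements State {
        final State base;
        final Map<String, String> overlay = new HashMap<>();  // null value = removed child

        Modified(State base) { this.base = base; }

        Modified set(String name, String value) { overlay.put(name, value); return this; }

        public String child(String name) {
            return overlay.containsKey(name) ? overlay.get(name) : base.child(name);
        }

        public Set<String> childNames() {
            Set<String> names = new HashSet<>(base.childNames());
            overlay.forEach((k, v) -> { if (v == null) names.remove(k); else names.add(k); });
            return names;
        }

        public void compareAgainstBaseState(State before, Diff diff) {
            if (!(before instanceof Modified)) {
                fullCompare(this, before, diff);   // no shortcut available in this sketch
                return;
            }
            Modified other = (Modified) before;
            // Step 1: compare only the children touched by either overlay.
            Set<String> touched = new HashSet<>(overlay.keySet());
            touched.addAll(other.overlay.keySet());
            for (String name : touched) {
                report(name, other.child(name), child(name), diff);
            }
            // Step 2: continue with base against base, skipping names handled above.
            base.compareAgainstBaseState(other.base, (name, b, a) -> {
                if (!touched.contains(name)) diff.changed(name, b, a);
            });
        }
    }

    static void report(String name, String before, String after, Diff diff) {
        if (!Objects.equals(before, after)) diff.changed(name, before, after);
    }

    /** Fallback that visits every child of both states. */
    static void fullCompare(State after, State before, Diff diff) {
        Set<String> names = new HashSet<>(after.childNames());
        names.addAll(before.childNames());
        for (String name : names) report(name, before.child(name), after.child(name), diff);
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i < 10000; i++) m.put("test" + i, "nt:folder");
        Stored stored = new Stored(m);
        Modified before = new Modified(stored).set("a", "nt:folder");
        Modified after = new Modified(stored).set("a", "nt:folder").set("b", "nt:folder");
        List<String> seen = new ArrayList<>();
        after.compareAgainstBaseState(before, (name, b, a) -> seen.add(name));
        System.out.println(seen + " fullScans=" + stored.fullScans);
    }
}
```

In this toy, both modified states share the same stored base, so step 2 hits the {{before == this}} shortcut and the 10000 stored children are never visited; {{main}} prints {{[b] fullScans=0}}. If the two bases differ, step 2 falls back to whatever comparison the base itself offers, which is where a real optimization for persisted states would have to apply.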