[ 
https://issues.apache.org/jira/browse/OAK-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674124#comment-13674124
 ] 

Jukka Zitting commented on OAK-853:
-----------------------------------

bq. I found that the original problematic call is ...

Ah, good, that makes more sense.

The basic idea behind {{compareAgainstBaseState()}} is that even though it 
allows comparisons of arbitrary {{NodeStates}}, the expectation (as encoded in 
the method name) is that comparison of a state against a previous version of it 
(earlier revision of a {{KernelNodeState}}, or {{ModifiedNodeState}} against 
its base state, etc.) can be done efficiently (i.e. in many cases the 
{{this.base.compareAgainstBaseState(x, Diff)}} call would be 
{{x.compareAgainstBaseState(x, Diff)}}, which would reduce to just a no-op).

I guess in some cases that last optimization wouldn't apply. In that case, a 
better approach than reversing the diff (which kind of invalidates the "compare 
against previous version" idea) might be to check whether {{x instanceof 
ModifiedNodeState}} and, if so, do the 
{{ModifiedNodeState}}-to-{{ModifiedNodeState}} comparison before continuing with 
{{this.base.compareAgainstBaseState(x.base, Diff)}}.
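
To make the idea concrete, here is a rough, self-contained sketch. It is not Oak's actual API: {{StoredState}}, {{ModifiedState}}, the single-method {{Diff}} and the flat name/value model are all hypothetical simplifications (real node states also have child nodes). It only illustrates comparing the two sets of pending changes first and then delegating to the base-to-base comparison, which becomes a no-op when both states share the same base.

```java
import java.util.*;

// Hypothetical demo driver, not part of Oak.
class ReverseDiffSketch {
    static List<String> demo() {
        StoredState.fullScans = 0;
        NodeState stored = new StoredState(Map.of("a", "1", "b", "2"));
        NodeState before = new ModifiedState(stored, Map.of("c", "3"));
        NodeState after = new ModifiedState(stored, Map.of("c", "4", "d", "5"));
        List<String> out = new ArrayList<>();
        after.compareAgainstBaseState(before, (n, b, a) -> out.add(n + ": " + b + " -> " + a));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println("full scans: " + StoredState.fullScans);
    }
}

// Simplified diff callback: a null value means "absent".
interface Diff {
    void change(String name, String before, String after);
}

interface NodeState {
    String get(String name);
    Set<String> names();
    void compareAgainstBaseState(NodeState base, Diff diff);
}

// Stand-in for a persisted state (assumption: comparing two different
// stored states is the expensive path we want to avoid).
final class StoredState implements NodeState {
    static int fullScans = 0; // counts expensive comparisons, for illustration
    private final Map<String, String> props;

    StoredState(Map<String, String> props) { this.props = new HashMap<>(props); }
    public String get(String name) { return props.get(name); }
    public Set<String> names() { return props.keySet(); }

    public void compareAgainstBaseState(NodeState base, Diff diff) {
        if (base == this) return; // same state: nothing to report
        fullScans++;
        Set<String> all = new TreeSet<>(names());
        all.addAll(base.names());
        for (String n : all) {
            String before = base.get(n), after = get(n);
            if (!Objects.equals(before, after)) diff.change(n, before, after);
        }
    }
}

// Stand-in for a state with pending changes on top of a base state.
final class ModifiedState implements NodeState {
    final NodeState base;
    final Map<String, String> changes; // name -> new value, null = removed

    ModifiedState(NodeState base, Map<String, String> changes) {
        this.base = base;
        this.changes = new HashMap<>(changes);
    }
    public String get(String name) {
        return changes.containsKey(name) ? changes.get(name) : base.get(name);
    }
    public Set<String> names() {
        Set<String> result = new TreeSet<>(base.names());
        for (Map.Entry<String, String> e : changes.entrySet()) {
            if (e.getValue() == null) result.remove(e.getKey()); else result.add(e.getKey());
        }
        return result;
    }

    public void compareAgainstBaseState(NodeState x, Diff diff) {
        Set<String> touched = new TreeSet<>(changes.keySet());
        NodeState xBase = x;
        if (x instanceof ModifiedState) {
            ModifiedState mx = (ModifiedState) x;
            touched.addAll(mx.changes.keySet());
            xBase = mx.base;
        }
        // Step 1: compare only the names touched by either set of changes.
        for (String n : touched) {
            String before = x.get(n), after = get(n);
            if (!Objects.equals(before, after)) diff.change(n, before, after);
        }
        // Step 2: base-to-base comparison, skipping names handled in step 1.
        base.compareAgainstBaseState(xBase, (n, b, a) -> {
            if (!touched.contains(n)) diff.change(n, b, a);
        });
    }
}
```

In this toy setup both modified states share one stored base, so step 2 hits the identity check and no full scan happens; only the touched names {{c}} and {{d}} are reported.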
                
> Many child nodes: Diffing causes many calls to MicroKernel.getNodes
> -------------------------------------------------------------------
>
>                 Key: OAK-853
>                 URL: https://issues.apache.org/jira/browse/OAK-853
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>            Reporter: Thomas Mueller
>         Attachments: OAK-853.patch
>
>
> Creating a flat hierarchy of the following form causes many calls to 
> MicroKernel.getNodes and is thus slow.
> {code}
> for (int i = 0; i < 10000; i++) {
>     root.addNode("test" + i, "nt:folder");
>     if (i % 1000 == 0) {
>         session.save();
>     }
> }
> {code}
> As far as I see, this isn't just the case for MicroKernel-based storage, but 
> also for the SegmentNodeStore. The reason seems to be that the optimization 
> for many child nodes in KernelNodeState.compareAgainstBaseState and 
> SegmentNodeState.compareAgainstBaseState that avoids iterating over all 
> children doesn't work. 
> The optimization uses:
> {code}
> if (base instanceof SegmentNodeState) ...
> if (base instanceof KernelNodeState) ...
> {code}
> Ideally, the instanceof should be avoided, but I'm not sure how to do that 
> yet. Anyway, the problem is that "base" is a ModifiedNodeState so no 
> optimization can be used.
> I was thinking, couldn't the ModifiedNodeState do a reverse diff in this 
> case? That is, inside ModifiedNodeState.compareAgainstBaseState, check if the 
> "base" parameter is a ModifiedNodeState, and the "base" field is not, then do 
> a reverse diff, which would be efficient. (We should probably not use "base" 
> for both the field name and the parameter; well that's a change for another 
> time.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
