[ https://issues.apache.org/jira/browse/OAK-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674124#comment-13674124 ]
Jukka Zitting commented on OAK-853:
-----------------------------------
bq. I found that the original problematic call is ...
Ah, good, that makes more sense.
The basic idea behind {{compareAgainstBaseState()}} is that even though it
allows comparisons of arbitrary {{NodeStates}}, the expectation (as encoded in
the method name) is that comparing a state against a previous version of it
(an earlier revision of a {{KernelNodeState}}, or a {{ModifiedNodeState}}
against its base state, etc.) can be done efficiently. That is, in many cases
the {{this.base.compareAgainstBaseState(x, Diff)}} call would be
{{x.compareAgainstBaseState(x, Diff)}}, which reduces to a no-op.
I guess in some cases that last optimization wouldn't apply. In that case a
better approach than reversing the diff (which kind of invalidates the "compare
against previous version" idea) might be to check if {{x instanceof
ModifiedNodeState}} and do the {{ModifiedNodeState}}-to-{{ModifiedNodeState}}
comparison first before continuing with
{{this.base.compareAgainstBaseState(x.base, Diff)}}.
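To make the suggestion above more concrete, here is a rough sketch of the idea, not the actual Oak code. {{SketchState}}, {{SketchDiff}}, {{StoredState}} and {{ModifiedState}} are hypothetical, heavily simplified stand-ins for {{NodeState}}, {{NodeStateDiff}}, a persisted state such as {{KernelNodeState}}, and {{ModifiedNodeState}}. They model properties only (no child nodes), and the single {{changed()}} callback is an assumption made to keep the example short.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical stand-in for NodeStateDiff: one callback, null means "absent". */
interface SketchDiff {
    void changed(String name, String before, String after);
}

/** Hypothetical stand-in for NodeState (properties only, no child nodes). */
interface SketchState {
    String get(String name);
    Set<String> names();
    void compareAgainstBaseState(SketchState base, SketchDiff diff);
}

/** Plays the role of a persisted state (KernelNodeState, SegmentNodeState). */
final class StoredState implements SketchState {
    static int fullScans = 0; // counts how often the slow fallback runs
    private final Map<String, String> props;

    StoredState(Map<String, String> props) { this.props = new TreeMap<>(props); }

    public String get(String name) { return props.get(name); }
    public Set<String> names() { return props.keySet(); }

    public void compareAgainstBaseState(SketchState base, SketchDiff diff) {
        if (base == this) {
            return; // same state on both sides: no-op
        }
        fullScans++; // fallback: look at every name on either side
        Set<String> all = new TreeSet<>(props.keySet());
        all.addAll(base.names());
        for (String name : all) {
            String before = base.get(name);
            String after = get(name);
            if (!Objects.equals(before, after)) diff.changed(name, before, after);
        }
    }
}

/** Plays the role of ModifiedNodeState: a base state plus in-memory changes. */
final class ModifiedState implements SketchState {
    final SketchState base;
    final Map<String, String> changes; // name -> new value, null = removed

    ModifiedState(SketchState base, Map<String, String> changes) {
        this.base = base;
        this.changes = new TreeMap<>(changes);
    }

    public String get(String name) {
        return changes.containsKey(name) ? changes.get(name) : base.get(name);
    }

    public Set<String> names() {
        Set<String> result = new TreeSet<>(base.names());
        for (Map.Entry<String, String> e : changes.entrySet()) {
            if (e.getValue() == null) result.remove(e.getKey());
            else result.add(e.getKey());
        }
        return result;
    }

    public void compareAgainstBaseState(SketchState x, SketchDiff diff) {
        // Step 1: handle the names touched by the modifications on either side.
        Set<String> touched = new TreeSet<>(changes.keySet());
        SketchState xBase = x;
        if (x instanceof ModifiedState) {
            ModifiedState mx = (ModifiedState) x;
            touched.addAll(mx.changes.keySet());
            xBase = mx.base;
        }
        for (String name : touched) {
            String before = x.get(name);
            String after = get(name);
            if (!Objects.equals(before, after)) diff.changed(name, before, after);
        }
        // Step 2: continue base-to-base, skipping names already handled above.
        base.compareAgainstBaseState(xBase, (name, before, after) -> {
            if (!touched.contains(name)) diff.changed(name, before, after);
        });
    }
}
```

When both modified states share the same underlying base, step 2 hits the identity check in {{StoredState}} and no full scan takes place; only the names touched by either set of modifications are looked at. The filtering in step 2 is there so that a name changed in memory is not reported a second time by the base-to-base comparison.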
> Many child nodes: Diffing causes many calls to MicroKernel.getNodes
> -------------------------------------------------------------------
>
> Key: OAK-853
> URL: https://issues.apache.org/jira/browse/OAK-853
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Thomas Mueller
> Attachments: OAK-853.patch
>
>
> Creating a flat hierarchy of the following form causes many calls to
> MicroKernel.getNodes and is thus slow.
> {code}
> for (int i = 0; i < 10000; i++) {
>     root.addNode("test" + i, "nt:folder");
>     if (i % 1000 == 0) {
>         session.save();
>     }
> }
> {code}
> As far as I see, this isn't just the case for MicroKernel based storage, but
> also for the SegmentNodeStore. The reason seems to be that the optimization
> for many child nodes in KernelNodeState.compareAgainstBaseState and
> SegmentNodeState.compareAgainstBaseState that avoids iterating over all
> children doesn't work.
> The optimization uses:
> {code}
> if (base instanceof SegmentNodeState) ...
> if (base instanceof KernelNodeState) ...
> {code}
> Ideally, the instanceof should be avoided, but I'm not sure how to do that
> yet. Anyway, the problem is that "base" is a ModifiedNodeState so no
> optimization can be used.
> I was thinking, couldn't the ModifiedNodeState do a reverse diff in this
> case? That is, inside ModifiedNodeState.compareAgainstBaseState, check if the
> "base" parameter is a ModifiedNodeState, and the "base" field is not, then do
> a reverse diff, which would be efficient. (We should probably not use "base"
> for both the field name and the parameter; well that's a change for another
> time.)