[ https://issues.apache.org/jira/browse/OAK-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joerg Hoh resolved OAK-11607.
-----------------------------
    Fix Version/s: 1.80.0
       Resolution: Fixed

> Node.getNodes() not lazy for orderable nodetypes
> ------------------------------------------------
>
>                 Key: OAK-11607
>                 URL: https://issues.apache.org/jira/browse/OAK-11607
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.76.0
>            Reporter: Joerg Hoh
>            Assignee: Joerg Hoh
>            Priority: Major
>             Fix For: 1.80.0
>
>
> In AEM we have a lot of functionality that retrieves child nodes but does not 
> consume all children from the resulting NodeIterator.
> For example, we have a function like this:
> {noformat}
>     private boolean hasRelevantChildren(Resource resource) {
>         for (Iterator<Resource> it = resource.listChildren(); it.hasNext(); ) {
>             Resource r = it.next();
>             // don't consider repository nodes (e.g. rep:policy) or content resources as children
>             if (r.getName().startsWith("rep:") || r.getName().equals(JcrConstants.JCR_CONTENT)
>                     || r.getName().equals(JcrConstants.JCR_FROZENNODE)) {
>                 continue;
>             }
>             return true;
>         }
>         return false;
>     }
> {noformat}
> which normally reads only a few nodes from the iterator. However, I have found 
> a good number of occurrences of stack traces like this:
> {noformat}
>         at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$2.iterator(DocumentNodeState.java:368)
>         at java.lang.Iterable.forEach([email protected]/Iterable.java:74)
>         at org.apache.jackrabbit.guava.common.collect.Iterables$5.forEach(Iterables.java:748)
>         at org.apache.jackrabbit.guava.common.collect.Iterables$4.forEach(Iterables.java:586)
>         at org.apache.jackrabbit.oak.commons.collections.CollectionUtils.toLinkedSet(CollectionUtils.java:139)
>         at org.apache.jackrabbit.oak.plugins.tree.impl.AbstractTree.getChildNames(AbstractTree.java:129)
>         at org.apache.jackrabbit.oak.plugins.tree.impl.AbstractTree.getChildren(AbstractTree.java:312)
>         at org.apache.jackrabbit.oak.core.MutableTree.getChildren(MutableTree.java:178)
>         at org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.getChildren(NodeDelegate.java:343)
>         at org.apache.jackrabbit.oak.jcr.session.NodeImpl$8.perform(NodeImpl.java:582)
>         at org.apache.jackrabbit.oak.jcr.session.NodeImpl$8.perform(NodeImpl.java:578)
>         at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:229)
>         at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
>         at org.apache.jackrabbit.oak.jcr.session.NodeImpl.getNodes(NodeImpl.java:578)
>         at org.apache.sling.jcr.resource.internal.helper.jcr.JcrNodeResource.listJcrChildren(JcrNodeResource.java:227)
>         at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.listChildren(JcrResourceProvider.java:404)
>         at org.apache.sling.resourceresolver.impl.providers.stateful.AuthenticatedResourceProvider.listChildren(AuthenticatedResourceProvider.java:169)
>         at org.apache.sling.resourceresolver.impl.helper.ResourceResolverControl.listChildren(ResourceResolverControl.java:297)
>         at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.listChildren(ResourceResolverImpl.java:546)
>         at org.apache.sling.api.resource.AbstractResource.listChildren(AbstractResource.java:91)
>         at org.apache.sling.api.resource.ResourceWrapper.listChildren(ResourceWrapper.java:105)
>         at ...hasRelevantChildren(....java:279)
> {noformat}
> Looking at this stack trace, it appears that Node.getNodes() is not entirely 
> lazy: deep in oak.core, {{AbstractTree.getChildNames()}} is called, which 
> reads _all_ child names into a Set, even though the underlying 
> DocumentNodeState itself returns an iterator and would therefore be lazy.
> This means that for node types with ordered child nodes, merely obtaining the 
> NodeIterator (without even iterating over it) is an expensive and slow 
> operation if many child nodes are present.
> We should find a way to optimize this case so that not all child names are 
> read when the iterator is built, giving getNodes() lazy semantics.
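
The difference described above can be illustrated with a minimal, self-contained sketch. It is not Oak code: the class LazyChildrenSketch and the helpers CountingChildren, eagerChildren, lazyChildren and readsForFirstChild are hypothetical names made up for this illustration, and the sketch ignores how child order is actually resolved for orderable node types. It only counts how many child names are read from the underlying source when the caller consumes a single child.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.NoSuchElementException;
import java.util.Set;

public class LazyChildrenSketch {

    /** Hypothetical stand-in for a node state that produces child names on demand. */
    static final class CountingChildren implements Iterable<String> {
        private final int total;
        int reads = 0; // number of child names actually produced so far

        CountingChildren(int total) {
            this.total = total;
        }

        @Override
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                private int next = 0;

                @Override
                public boolean hasNext() {
                    return next < total;
                }

                @Override
                public String next() {
                    if (next >= total) {
                        throw new NoSuchElementException();
                    }
                    reads++;
                    return "child-" + next++;
                }
            };
        }
    }

    /** Eager variant: copies every child name into a Set before returning an iterator. */
    static Iterator<String> eagerChildren(Iterable<String> source) {
        Set<String> names = new LinkedHashSet<>();
        for (String name : source) {
            names.add(name);
        }
        return names.iterator();
    }

    /** Lazy variant: hands out the source iterator, so names are read only when consumed. */
    static Iterator<String> lazyChildren(Iterable<String> source) {
        return source.iterator();
    }

    /** Consumes only the first child and reports how many names the source had to produce. */
    static int readsForFirstChild(boolean eager, int total) {
        CountingChildren children = new CountingChildren(total);
        Iterator<String> it = eager ? eagerChildren(children) : lazyChildren(children);
        if (it.hasNext()) {
            it.next();
        }
        return children.reads;
    }

    public static void main(String[] args) {
        System.out.println("eager reads: " + readsForFirstChild(true, 10_000));
        System.out.println("lazy reads: " + readsForFirstChild(false, 10_000));
    }
}
```

With 10,000 simulated children, the eager variant reads all 10,000 names before the caller sees the first one, while the lazy variant reads exactly one. A caller such as hasRelevantChildren, which usually returns after a few children, benefits only from the second behaviour.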



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
