[ 
https://issues.apache.org/jira/browse/SLING-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259643#comment-17259643
 ] 

Miroslav Smiljanic commented on SLING-10011:
--------------------------------------------

I have created a [benchmark test|https://github.com/apache/jackrabbit-oak/commit/5dfd1145a0b71c4a1626b1fc6430b3ec40e9c4c0]
in Oak to compare the performance of the Node API and the Session API when 
retrieving the parent node.

The test run below uses the in-memory cache (100MB by default).
{noformat}
> java -jar target/oak-benchmarks-*-SNAPSHOT.jar benchmark GetParentNodeWithNodeAPI GetParentNodeWithSessionAPI Oak-Segment-Tar
Apache Jackrabbit Oak 1.37-SNAPSHOT
# GetParentNodeWithNodeAPI         C     min     10%     50%     90%     max       N    mean
Oak-Segment-Tar                    1       2       2       2       3       5   25891       2
# GetParentNodeWithSessionAP       C     min     10%     50%     90%     max       N    mean
Oak-Segment-Tar                    1      26      27      29      32      40    2069      29
{noformat}
At the 90th percentile, parent node retrieval is ~10 times slower with the 
Session API (32 vs. 3).

The second run disables the cache to simulate a high eviction rate of the 
in-memory cache, which can happen when a Sling application is under heavy load 
with a large number of concurrent requests.
{noformat}
> java -jar target/oak-benchmarks-*-SNAPSHOT.jar benchmark --cache 0 GetParentNodeWithNodeAPI GetParentNodeWithSessionAPI Oak-Segment-Tar
Apache Jackrabbit Oak 1.37-SNAPSHOT
# GetParentNodeWithNodeAPI         C     min     10%     50%     90%     max       N    mean
Oak-Segment-Tar                    1       7       8       9      10      14    6812       9
# GetParentNodeWithSessionAP       C     min     10%     50%     90%     max       N    mean
Oak-Segment-Tar                    1    2209    2210    2227    2249    2274      27    2230
{noformat}
In this case, parent node retrieval at the 90th percentile is ~200 times 
slower with the Session API (2249 vs. 10).
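
The slowdown factors can be recomputed from the 50% and 90% columns of the two 
runs. The snippet below is only a small arithmetic check; the class and method 
names are made up for illustration and are not part of the Oak benchmark code.

```java
import java.util.Locale;

final class SlowdownRatio {

    /** Ratio of the Session API figure to the Node API figure for one column. */
    static double ratio(double sessionApi, double nodeApi) {
        return sessionApi / nodeApi;
    }

    public static void main(String[] args) {
        // Values copied from the 50% and 90% columns of the two benchmark runs.
        System.out.printf(Locale.ROOT, "cache enabled,  50%%: %.1fx%n", ratio(29, 2));
        System.out.printf(Locale.ROOT, "cache enabled,  90%%: %.1fx%n", ratio(32, 3));
        System.out.printf(Locale.ROOT, "cache disabled, 50%%: %.1fx%n", ratio(2227, 9));
        System.out.printf(Locale.ROOT, "cache disabled, 90%%: %.1fx%n", ratio(2249, 10));
    }
}
```

With these figures the ratios come out at roughly 14.5 and 10.7 with the cache 
enabled, and roughly 247.4 and 224.9 with the cache disabled.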

This test uses the TAR segment store, with segments persisted on a local SSD. 
With a remote segment store implementation (Azure/AWS), the Session API test 
would be even slower because of the network round trips involved.
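
One way to picture the difference between the two lookups is a toy model in 
which a node keeps a reference to its parent, while a path-based lookup walks 
down from the root and pays one simulated storage read per path segment. This 
is a sketch under that assumption only; ToyNode, parentViaNode, nodeViaPath 
and the read counter are hypothetical and do not reproduce Oak or JCR 
internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy in-memory tree; NOT Oak or JCR code. Counts simulated storage reads. */
final class ParentLookupSketch {

    /** Hypothetical node with a parent reference and named children. */
    static final class ToyNode {
        final String name;
        final ToyNode parent;
        final Map<String, ToyNode> children = new LinkedHashMap<>();

        ToyNode(String name, ToyNode parent) {
            this.name = name;
            this.parent = parent;
        }

        ToyNode addChild(String childName) {
            ToyNode child = new ToyNode(childName, this);
            children.put(childName, child);
            return child;
        }
    }

    /** Number of simulated storage reads performed so far (assumption of the model). */
    static int reads = 0;

    /** Node-API style: follow the reference the child already holds. */
    static ToyNode parentViaNode(ToyNode node) {
        return node.parent; // no simulated storage read in this model
    }

    /** Session-API style: resolve an absolute path from the root, one read per segment. */
    static ToyNode nodeViaPath(ToyNode root, String absPath) {
        ToyNode current = root;
        for (String segment : absPath.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            reads++;
            current = current.children.get(segment);
            if (current == null) {
                return null; // path does not exist
            }
        }
        return current;
    }

    public static void main(String[] args) {
        ToyNode root = new ToyNode("", null);
        ToyNode page = root.addChild("content").addChild("site").addChild("page");
        ToyNode leaf = page.addChild("jcr:content");

        ToyNode viaNode = parentViaNode(leaf);
        System.out.println("reads after node lookup: " + reads);

        ToyNode viaPath = nodeViaPath(root, "/content/site/page");
        System.out.println("reads after path lookup: " + reads);
        System.out.println("same parent: " + (viaNode == viaPath));
    }
}
```

In this model the path lookup for /content/site/page records three reads and 
the parent-reference lookup records none, so the gap grows with the depth of 
the node and with the cost of each read.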

 

> Use javax.jcr.Item.getParent() when resolving parent JCR node in 
> JcrResourceProvider#getParent
> ----------------------------------------------------------------------------------------------
>
>                 Key: SLING-10011
>                 URL: https://issues.apache.org/jira/browse/SLING-10011
>             Project: Sling
>          Issue Type: Improvement
>          Components: JCR
>    Affects Versions: JCR Resource 3.0.22
>            Reporter: Miroslav Smiljanic
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently 
> [JcrResourceProvider.getParent|https://github.com/apache/sling-org-apache-sling-jcr-resource/blob/org.apache.sling.jcr.resource-3.0.22/src/main/java/org/apache/sling/jcr/resource/internal/helper/jcr/JcrResourceProvider.java#L361]
>  uses JcrItemResourceFactory.getItemOrNull(String path), which eventually 
> uses the JCR session to retrieve the parent node by absolute path.
> I propose using javax.jcr.Item.getParent() instead.
> The reasoning is to benefit from potential improvements in the JCR 
> implementation that would retrieve the whole subtree for a given node. That 
> could be configured, for example, by node type or node path.
> {noformat}
>     root
>      |
>      a 
>    /   \
>   b     c    
> {noformat}
> If node 'a' in the picture above matches the desired configuration, then the 
> code below would return the whole subtree.
> {code:java}
> Node a = jcrSession.getNode("/a"); // Session.getNode expects an absolute path
> {code}
> That further means the retrieved subtree can be traversed in memory, without 
> the need to communicate with the JCR repository storage.
> (!)That is particularly important when remote (cloud) storage is used for the 
> repository in the JCR implementation, because tree traversal can then be done 
> without additional network round trips.
> {code:java}
> //JCR tree traversal happens in memory
> Node b = a.getNode("b");
> Node c = a.getNode("c");
> {code}
> Going from child to parent is resolved in memory as well (the proposal 
> relates to this fact).
> {code:java}
> //JCR tree traversal happens in memory
> assert b.getParent().isSame(c.getParent()); // isSame() compares repository identity, not Java references
> {code}
> Jackrabbit Oak supports node bundling for configured node types in the 
> document node store:
>  [http://jackrabbit.apache.org/oak/docs/nodestore/document/node-bundling.html]
> Currently I am also experimenting with support for node 
> bundling/aggregation for an arbitrary node store 
> ([NodeDelegateFullyLoaded|https://github.com/smiroslav/jackrabbit-oak/blob/ppnextgen_newstore/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/delegate/NodeDelegateFullyLoaded.java],
>  
> [FullyLoadedTree|https://github.com/smiroslav/jackrabbit-oak/blob/ppnextgen_newstore/oak-core/src/main/java/org/apache/jackrabbit/oak/core/FullyLoadedTree.java]).
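
The idea in the description above, loading a subtree once and then traversing 
it in memory, can be pictured with a toy model. This is a minimal sketch and 
not Oak's node bundling or the linked experimental classes: ToyStore, 
loadBundle and ToyNode are hypothetical names, and a "round trip" is just a 
counter.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of subtree bundling; NOT Oak code. */
final class BundledSubtreeSketch {

    /** Hypothetical remote store: each bundle load counts as one round trip. */
    static final class ToyStore {
        private final Map<String, List<String>> children = new HashMap<>();
        int roundTrips = 0;

        void put(String path, String... childNames) {
            children.put(path, Arrays.asList(childNames));
        }

        /** Returns the paths of the node and all its descendants in one simulated round trip. */
        Set<String> loadBundle(String path) {
            roundTrips++;
            Set<String> bundle = new HashSet<>();
            Deque<String> todo = new ArrayDeque<>();
            todo.push(path);
            while (!todo.isEmpty()) {
                String current = todo.pop();
                bundle.add(current);
                for (String name : children.getOrDefault(current, Collections.<String>emptyList())) {
                    todo.push(current + "/" + name);
                }
            }
            return bundle;
        }
    }

    /** Node view backed by an already loaded bundle; navigation never touches the store. */
    static final class ToyNode {
        final String path;
        private final Set<String> bundle;

        ToyNode(String path, Set<String> bundle) {
            this.path = path;
            this.bundle = bundle;
        }

        /** Child lookup inside the bundle; null if the child is not part of it. */
        ToyNode getNode(String name) {
            String childPath = path + "/" + name;
            return bundle.contains(childPath) ? new ToyNode(childPath, bundle) : null;
        }

        /** Parent lookup inside the bundle; null if the parent lies outside it. */
        ToyNode getParent() {
            String parentPath = path.substring(0, path.lastIndexOf('/'));
            return bundle.contains(parentPath) ? new ToyNode(parentPath, bundle) : null;
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("/a", "b", "c");

        ToyNode a = new ToyNode("/a", store.loadBundle("/a"));
        ToyNode b = a.getNode("b");
        ToyNode c = a.getNode("c");

        System.out.println("same parent: " + b.getParent().path.equals(c.getParent().path));
        System.out.println("round trips: " + store.roundTrips);
    }
}
```

In this model the store is contacted once for node 'a', and both the 
parent-to-child and the child-to-parent steps are answered from the loaded 
bundle. A parent outside the bundle returns null here; a real implementation 
would have to go back to storage for it.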



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
