[ 
https://issues.apache.org/jira/browse/JCR-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Huber updated JCR-2835:
-----------------------------

    Affects Version/s: 2.3.0
                       2.2.1
               Status: Patch Available  (was: Open)

I am attaching a first pass at the descendant search tests. 
These tests were performed on the trunk WITHOUT the proposed patch. I will work 
on implementing Jukka's proposal now that I have the tests.

Please review the XPath one as I am not that fluent in those queries. 

The current difference is huge (provided my tests are correct):

XPath:
# DescendantSearchTest                   min     10%     50%     90%     max
2.2                                       25      34      43      59     265

SQL-2:
# SQL2DescendantSearchTest               min     10%     50%     90%     max
2.2                                   395318  395318  395318  395318  395318

If the test implementations look ok, I can commit them once reviewed. 

Best regards,
   Serge Huber.

> Poor performance of ISDESCENDANTNODE on SQL 2 queries
> -----------------------------------------------------
>
>                 Key: JCR-2835
>                 URL: https://issues.apache.org/jira/browse/JCR-2835
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core, query
>    Affects Versions: 2.2.0, 2.2.1, 2.3.0
>            Reporter: Serge Huber
>            Assignee: Serge Huber
>             Fix For: 2.3.0
>
>         Attachments: 
> JCR-2835_Poor_performance_on_ISDESCENDANTNODE_constraint_v1.patch
>
>
> Using the latest source code, I have noticed very bad performance on SQL-2 
> queries that use the ISDESCENDANTNODE constraint on a large sub-tree. For 
> example, the query:
>     select * from [jnt:news] as news where ISDESCENDANTNODE(news,'/root/site') order by news.[date] desc
> executes in 600 ms, whereas the same query without the constraint:
>     select * from [jnt:news] as news order by news.[date] desc
> executes in 4 ms.
> From looking at the problem in the YourKit profiler, it seems that the 
> culprit is the constraint building, which uses recursive Lucene searches to 
> build the list of descendant node IDs:
>     private Query getDescendantNodeQuery(
>             DescendantNode dn, JackrabbitIndexSearcher searcher)
>             throws RepositoryException, IOException {
>         BooleanQuery query = new BooleanQuery();
>         try {
>             LinkedList<NodeId> ids = new LinkedList<NodeId>();
>             NodeImpl ancestor = (NodeImpl) session.getNode(dn.getAncestorPath());
>             ids.add(ancestor.getNodeId());
>             while (!ids.isEmpty()) {
>                 String id = ids.removeFirst().toString();
>                 Query q = new JackrabbitTermQuery(new Term(FieldNames.PARENT, id));
>                 QueryHits hits = searcher.evaluate(q);
>                 ScoreNode sn = hits.nextScoreNode();
>                 if (sn != null) {
>                     query.add(q, SHOULD);
>                     do {
>                         ids.add(sn.getNodeId());
>                         sn = hits.nextScoreNode();
>                     } while (sn != null);
>                 }
>             }
>         } catch (PathNotFoundException e) {
>             query.add(new JackrabbitTermQuery(new Term(
>                     FieldNames.UUID, "invalid-node-id")), // never matches
>                     SHOULD);
>         }
>         return query;
>     }
> In the above example this generates over 2800 Lucene queries, which accounts 
> for the slowdown. I wonder whether it would be faster to retrieve the list 
> of child IDs through the JCR API instead?
> This was probably missed because there do not seem to be any performance 
> tests for this constraint.
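
To make the cost of the quoted method easier to see, here is a minimal, self-contained sketch of the same breadth-first traversal, with the Lucene index replaced by an in-memory map. The class name DescendantLookupSketch, its methods, and the toy tree are all hypothetical and not part of Jackrabbit; the sketch only counts lookups and does not measure real query time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the traversal in getDescendantNodeQuery:
// the map plays the role of the Lucene PARENT field (parent id -> child ids).
class DescendantLookupSketch {

    private final Map<String, List<String>> childrenByParent = new HashMap<>();
    private int lookups = 0;

    public void addChild(String parentId, String childId) {
        childrenByParent.computeIfAbsent(parentId, k -> new ArrayList<>()).add(childId);
    }

    // Stands in for evaluating one term query on the PARENT field.
    private List<String> findChildren(String parentId) {
        lookups++;
        return childrenByParent.getOrDefault(parentId, Collections.emptyList());
    }

    // Same shape as the quoted loop: every visited node costs one lookup,
    // and every node that has children contributes one SHOULD clause.
    public List<String> parentsWithChildren(String ancestorId) {
        List<String> clauses = new ArrayList<>();
        Deque<String> ids = new ArrayDeque<>();
        ids.add(ancestorId);
        while (!ids.isEmpty()) {
            String id = ids.removeFirst();
            List<String> children = findChildren(id);
            if (!children.isEmpty()) {
                clauses.add(id);
                ids.addAll(children);
            }
        }
        return clauses;
    }

    public int lookupCount() {
        return lookups;
    }

    public static void main(String[] args) {
        // Toy tree (an assumption for illustration): 1 site, 3 sections, 2 news items each.
        DescendantLookupSketch index = new DescendantLookupSketch();
        for (int s = 0; s < 3; s++) {
            String section = "section" + s;
            index.addChild("site", section);
            for (int n = 0; n < 2; n++) {
                index.addChild(section, section + "-news" + n);
            }
        }
        List<String> clauses = index.parentsWithChildren("site");
        // Prints: lookups=10 clauses=4
        System.out.println("lookups=" + index.lookupCount() + " clauses=" + clauses.size());
    }
}
```

In this sketch the number of lookups equals the number of nodes in the sub-tree (the ancestor included), which is consistent with a large sub-tree producing thousands of Lucene queries before the actual search even starts.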

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
