Hi Chetan,

For the lucene index provider we have the following configured
Enabled:
  Copy on Read
  Copy on Write
  Open Index asynchronously
  Prefetch Index Files

Thread pool is 20
Extracted text cache size is 0
Extracted text cache expiry 300
Boolean clause limit 1024

Additional notes:
The timing is coming from a backup of our prod environment that I'm using for 
testing. Running the query in our production environment actually results in a 
response time that is closer to an hour. When we use a query that does not use 
Lucene and utilizes another index like cqMaster it is magnitudes faster, in the 
range of 400 milliseconds to count. 

I'm investigating the usage of ACL now.

The following is the text from the explain query

{
    "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) 
AND [cq:master] IS NOT NULL",
    "language": "JCR-SQL2",
    "explain": {
        "logs": [
            "cost using filter Filter(query=explain SELECT * FROM [nt:base] 
WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, 
path=/content//*, property=[cq:master=[is not null]])",
            "cost for aggregate lucene is Infinity",
            "Evaluating plan with index definition Lucene Index : 
cqTag(/oak:index/cqTagLucene)",
            "Evaluating plan with index definition Lucene Index : 
workflow(/oak:index/workflowDataLucene)",
            "Evaluating plan with index definition Lucene Index : 
authorizables(/oak:index/authorizables)",
            "Evaluating plan with index definition Lucene Index : 
/oak:index/damAssetLucene",
            "Evaluating plan with index definition Lucene Index : 
tags(/oak:index/ntBaseLucene)",
            "Evaluating plan with index definition Lucene Index : 
/oak:index/lucene",
            "Evaluating plan with index definition Lucene Index : 
cq:Page(/oak:index/cqPageLucene)",
            "cost for lucene-property[/oak:index/lucene] is 1001.0",
            "cost for reference is Infinity",
            "getPlans(Filter, List<OrderEntry>, NodeState)",
            "getPlans() - filter: Filter(query=explain SELECT * FROM [nt:base] 
WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, 
path=/content//*, property=[cq:master=[is not null]]) - ",
            "getPlans() - sortOrder: null - ",
            "getPlans() - rootState: { jcr:primaryType = rep:root, 
jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], 
sling:target = /index.html, sling:resourceType = sling:redirect, :async : { 
async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, 
f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = 
fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 
2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = { 
... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... }, 
content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = 
[mix:versionable], jcr:createdBy = admin, jcr:versionHistory = 
e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 
2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 
777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 
40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... }, 
rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
            "cost for ordered is Infinity",
            "cost for nodeType is Infinity",
            "property cost for cqMaster is 1584.0",
            "cost for property is 1584.0",
            "cost for traverse is 1.0588708E7"
        ],
        "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) 
:ancestors:/content where (isdescendantnode([nt:base], [/content])) and 
([nt:base].[cq:master] is not null) */"
    },
    "heuristics": {
        "executeTime": 0,
        "getNodesTime": 3279,
        "count": 2949,
        "countTime": 349807,
        "totalTime": 353086
    }
}

-Jason

-----Original Message-----
From: Chetan Mehrotra [mailto:[email protected]] 
Sent: Wednesday, August 30, 2017 1:03 AM
To: [email protected]
Subject: Re: slowness accessing lucene results

EXTERNAL

The query can be slow due to multiple reasons

1. Copy-On-Write support is not enabled

2. Some of the JCR query constraints may not be getting mapped to Lucene index 
i.e. index definition does not have full coverage of query constraints causing 
quite a few paths to get filtered out. So actual number of rows accessed may be 
large but only few of them qualify all criteria at query engine level. This can 
be confirmed by query explanation output

3. Or lots of filtering happening due to access control. Does the performance 
remains same with admin user?

Chetan Mehrotra


On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <[email protected]> wrote:
> Hi all,
>
> I'm tracking down a problem I'm currently experiencing when my queries use 
> the lucene index on oak 1.2.18.  Here are the figures that I get from the acs 
> commons tool that breaks down queries.
>
> "executeTime": 0
> "getNodesTime": 3279
> "count": 2949
> "countTime": 349807
> "totalTime": 353086
>
> In essence, it's taking me about 6 minutes to iterate through less than 3000 
> results. Does anyone have any ideas on root cause?
>
> -Jason

Reply via email to