Hi Chetan,
For the lucene index provider we have the following configured
Enabled:
Copy on Read
Copy on Write
Open Index asynchronously
Prefetch Index Files
Thread pool is 20
Extracted text cache size is 0
Extracted text cache expiry 300
Boolean clause limit 1024
Additional notes:
The timing is coming from a backup of our prod environment that I'm using for
testing. Running the query in our production environment actually results in a
response time that is closer to an hour. When we use a query that does not use
Lucene and utilizes another index like cqMaster it is magnitudes faster, in the
range of 400 milliseconds to count.
I'm investigating the usage of ACL now.
The following is the text from the explain query
{
"statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content])
AND [cq:master] IS NOT NULL",
"language": "JCR-SQL2",
"explain": {
"logs": [
"cost using filter Filter(query=explain SELECT * FROM [nt:base]
WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL,
path=/content//*, property=[cq:master=[is not null]])",
"cost for aggregate lucene is Infinity",
"Evaluating plan with index definition Lucene Index :
cqTag(/oak:index/cqTagLucene)",
"Evaluating plan with index definition Lucene Index :
workflow(/oak:index/workflowDataLucene)",
"Evaluating plan with index definition Lucene Index :
authorizables(/oak:index/authorizables)",
"Evaluating plan with index definition Lucene Index :
/oak:index/damAssetLucene",
"Evaluating plan with index definition Lucene Index :
tags(/oak:index/ntBaseLucene)",
"Evaluating plan with index definition Lucene Index :
/oak:index/lucene",
"Evaluating plan with index definition Lucene Index :
cq:Page(/oak:index/cqPageLucene)",
"cost for lucene-property[/oak:index/lucene] is 1001.0",
"cost for reference is Infinity",
"getPlans(Filter, List<OrderEntry>, NodeState)",
"getPlans() - filter: Filter(query=explain SELECT * FROM [nt:base]
WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL,
path=/content//*, property=[cq:master=[is not null]]) - ",
"getPlans() - sortOrder: null - ",
"getPlans() - rootState: { jcr:primaryType = rep:root,
jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable],
sling:target = /index.html, sling:resourceType = sling:redirect, :async : {
async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035,
f3ec0fb4-6f32-488d-b751-be3179dd0a40], async =
fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo =
2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = {
... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... },
content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes =
[mix:versionable], jcr:createdBy = admin, jcr:versionHistory =
e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created =
2016-01-18T15:28:50.055-05:00, jcr:baseVersion =
777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid =
40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... },
rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
"cost for ordered is Infinity",
"cost for nodeType is Infinity",
"property cost for cqMaster is 1584.0",
"cost for property is 1584.0",
"cost for traverse is 1.0588708E7"
],
"plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene)
:ancestors:/content where (isdescendantnode([nt:base], [/content])) and
([nt:base].[cq:master] is not null) */"
},
"heuristics": {
"executeTime": 0,
"getNodesTime": 3279,
"count": 2949,
"countTime": 349807,
"totalTime": 353086
}
}
-Jason
-----Original Message-----
From: Chetan Mehrotra [mailto:[email protected]]
Sent: Wednesday, August 30, 2017 1:03 AM
To: [email protected]
Subject: Re: slowness accessing lucene results
EXTERNAL
The query can be slow due to multiple reasons
1. Copy-On-Write support is not enabled
2. Some of the JCR query constraints may not be getting mapped to Lucene index
i.e. index definition does not have full coverage of query constraints causing
quite a few paths to get filtered out. So actual number of rows accessed may be
large but only few of them qualify all criteria at query engine level. This can
be confirmed by query explanation output
3. Or lots of filtering happening due to access control. Does the performance
remains same with admin user?
Chetan Mehrotra
On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <[email protected]> wrote:
> Hi all,
>
> I'm tracking down a problem I'm currently experiencing when my queries use
> the lucene index on oak 1.2.18. Here are the figures that I get from the acs
> commons tool that breaks down queries.
>
> "executeTime": 0
> "getNodesTime": 3279
> "count": 2949
> "countTime": 349807
> "totalTime": 353086
>
> In essence, it's taking me about 6 minutes to iterate through less than 3000
> results. Does anyone have any ideas on root cause?
>
> -Jason