Yes, that's how I patched the problem. You've answered my question by 
explaining that the index value is not the correct value and that issue is 
being addressed.

Thank you.
Jason 

-----Original Message-----
From: Chetan Mehrotra [mailto:[email protected]] 
Sent: Wednesday, August 30, 2017 10:56 AM
To: [email protected]
Subject: Re: slowness accessing lucene results

EXTERNAL

> "cost for lucene-property[/oak:index/lucene] is 1001.0", "property 
> cost for cqMaster is 1584.0",

Problem here is global fultext index /oak:index/lucene is getting picked up 
instead of cqMaster. This has been fixed with OAK-6333 (not backported to 1.2 
yet). As a workaround you can set 'entryCount'
property to some high value such that this lucene index is not preferred Chetan 
Mehrotra


On Wed, Aug 30, 2017 at 7:05 PM, Jason Bailey <[email protected]> wrote:
> Hi Chetan,
>
> For the lucene index provider we have the following configured
> Enabled:
>   Copy on Read
>   Copy on Write
>   Open Index asynchronously
>   Prefetch Index Files
>
> Thread pool is 20
> Extracted text cache size is 0
> Extracted text cache expiry 300
> Boolean clause limit 1024
>
> Additional notes:
> The timing is coming from a backup of our prod environment that I'm using for 
> testing. Running the query in our production environment actually results in 
> a response time that is closer to an hour. When we use a query that does not 
> use Lucene and utilizes another index like cqMaster it is magnitudes faster, 
> in the range of 400 milliseconds to count.
>
> I'm investigating the usage of ACL now.
>
> The following is the text from the explain query
>
> {
>     "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) 
> AND [cq:master] IS NOT NULL",
>     "language": "JCR-SQL2",
>     "explain": {
>         "logs": [
>             "cost using filter Filter(query=explain SELECT * FROM [nt:base] 
> WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, 
> path=/content//*, property=[cq:master=[is not null]])",
>             "cost for aggregate lucene is Infinity",
>             "Evaluating plan with index definition Lucene Index : 
> cqTag(/oak:index/cqTagLucene)",
>             "Evaluating plan with index definition Lucene Index : 
> workflow(/oak:index/workflowDataLucene)",
>             "Evaluating plan with index definition Lucene Index : 
> authorizables(/oak:index/authorizables)",
>             "Evaluating plan with index definition Lucene Index : 
> /oak:index/damAssetLucene",
>             "Evaluating plan with index definition Lucene Index : 
> tags(/oak:index/ntBaseLucene)",
>             "Evaluating plan with index definition Lucene Index : 
> /oak:index/lucene",
>             "Evaluating plan with index definition Lucene Index : 
> cq:Page(/oak:index/cqPageLucene)",
>             "cost for lucene-property[/oak:index/lucene] is 1001.0",
>             "cost for reference is Infinity",
>             "getPlans(Filter, List<OrderEntry>, NodeState)",
>             "getPlans() - filter: Filter(query=explain SELECT * FROM 
> [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, 
> path=/content//*, property=[cq:master=[is not null]]) - ",
>             "getPlans() - sortOrder: null - ",
>             "getPlans() - rootState: { jcr:primaryType = rep:root, 
> jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], 
> sling:target = /index.html, sling:resourceType = sling:redirect, :async : { 
> async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, 
> f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = 
> fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 
> 2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = { 
> ... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... }, 
> content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = 
> [mix:versionable], jcr:createdBy = admin, jcr:versionHistory = 
> e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 
> 2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 
> 777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 
> 40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... }, 
> rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
>             "cost for ordered is Infinity",
>             "cost for nodeType is Infinity",
>             "property cost for cqMaster is 1584.0",
>             "cost for property is 1584.0",
>             "cost for traverse is 1.0588708E7"
>         ],
>         "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) 
> :ancestors:/content where (isdescendantnode([nt:base], [/content])) and 
> ([nt:base].[cq:master] is not null) */"
>     },
>     "heuristics": {
>         "executeTime": 0,
>         "getNodesTime": 3279,
>         "count": 2949,
>         "countTime": 349807,
>         "totalTime": 353086
>     }
> }
>
> -Jason
>
> -----Original Message-----
> From: Chetan Mehrotra [mailto:[email protected]]
> Sent: Wednesday, August 30, 2017 1:03 AM
> To: [email protected]
> Subject: Re: slowness accessing lucene results
>
> EXTERNAL
>
> The query can be slow due to multiple reasons
>
> 1. Copy-On-Write support is not enabled
>
> 2. Some of the JCR query constraints may not be getting mapped to 
> Lucene index i.e. index definition does not have full coverage of 
> query constraints causing quite a few paths to get filtered out. So 
> actual number of rows accessed may be large but only few of them 
> qualify all criteria at query engine level. This can be confirmed by 
> query explanation output
>
> 3. Or lots of filtering happening due to access control. Does the performance 
> remains same with admin user?
>
> Chetan Mehrotra
>
>
> On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <[email protected]> wrote:
>> Hi all,
>>
>> I'm tracking down a problem I'm currently experiencing when my queries use 
>> the lucene index on oak 1.2.18.  Here are the figures that I get from the 
>> acs commons tool that breaks down queries.
>>
>> "executeTime": 0
>> "getNodesTime": 3279
>> "count": 2949
>> "countTime": 349807
>> "totalTime": 353086
>>
>> In essence, it's taking me about 6 minutes to iterate through less than 3000 
>> results. Does anyone have any ideas on root cause?
>>
>> -Jason

Reply via email to