[ https://issues.apache.org/jira/browse/OAK-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Hasler updated OAK-3219:
-------------------------------
    Fix Version/s:     (was: 1.6)
                   1.8

> Lucene IndexPlanner should also account for number of property constraints evaluated while giving cost estimation
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: OAK-3219
>                 URL: https://issues.apache.org/jira/browse/OAK-3219
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Thomas Mueller
>            Priority: Minor
>              Labels: performance
>             Fix For: 1.8
>
>
> Currently the cost returned by the Lucene index is a function of the number of
> indexed documents present in the index. If the number of indexed entries is
> high, this index is less likely to be selected when some property index also
> supports one of the property constraints.
> {noformat}
> /jcr:root/content/freestyle-cms/customers//element(*, cq:Page)
> [(jcr:content/@title = 'm' or jcr:like(jcr:content/@title, 'm%'))
> and jcr:content/@sling:resourceType = '/components/page/customer']
> {noformat}
> Consider the above query with the following index definitions:
> * A property index on resourceType
> * A Lucene index for cq:Page with properties {{jcr:content/title}},
> {{jcr:content/sling:resourceType}} indexed and also path restriction
> evaluation enabled
> The two indexes can help with the following restrictions:
> # Property index
> ## Path restriction
> ## Property restriction on {{sling:resourceType}}
> # Lucene index
> ## NodeType restriction
> ## Property restriction on {{sling:resourceType}}
> ## Property restriction on {{title}}
> ## Path restriction
> The cost estimate currently works like this:
> * Property index - {{f(indexedValueEstimate, estimateOfNodesUnderGivenPath)}}
> ** indexedValueEstimate - For 'sling:resourceType=foo' it is the approximate
> count of nodes having that property set to 'foo'
> ** estimateOfNodesUnderGivenPath - It is derived from an approximate estimation
> of the nodes present under the given path
> * Lucene index - {{f(totalIndexedEntries)}}
> Because the Lucene cost function is this simple, it does not reflect reality.
> The following 2 changes can be made to improve it:
> * Given that the Lucene index can handle more constraints (4) than the
> property index (2), the cost estimate returned by it should also reflect this.
> This can be done by setting costPerEntry to 1/(number of property
> restrictions evaluated)
> * Get the count for the queried property value - This is similar to what
> PropertyIndex does and assumes that Lucene can provide that information at
> O(1) cost. In case of multiple supported property restrictions this can be
> the minimum of all
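
To make the two proposed changes concrete, here is a minimal Python sketch of the arithmetic only. It is not Oak code: the function names, the `value_counts` lookup, and all the numbers are hypothetical, and it assumes the final cost is simply costPerEntry multiplied by the estimated entry count.

```python
from typing import Mapping, Sequence


def current_lucene_cost(total_indexed_entries: int) -> float:
    """Cost as described in the issue today: a function of index size only."""
    return float(total_indexed_entries)


def proposed_lucene_cost(
    total_indexed_entries: int,
    evaluated_properties: Sequence[str],
    value_counts: Mapping[str, int],
) -> float:
    """Illustrative combination of the two proposed changes (assumed formula).

    value_counts is a hypothetical lookup holding, per property, the number of
    indexed entries matching the queried value.
    """
    # Change 2: use the smallest per-property count as the entry estimate,
    # never exceeding the total number of indexed entries.
    counts = [value_counts[p] for p in evaluated_properties if p in value_counts]
    entry_estimate = min(counts + [total_indexed_entries])

    # Change 1: costPerEntry = 1 / (number of property restrictions evaluated).
    cost_per_entry = 1.0 / max(len(evaluated_properties), 1)

    return cost_per_entry * entry_estimate


if __name__ == "__main__":
    # Made-up figures for the example query in the description.
    props = ["jcr:content/sling:resourceType", "jcr:content/title"]
    counts = {"jcr:content/sling:resourceType": 5000, "jcr:content/title": 200}
    print(current_lucene_cost(100_000))
    print(proposed_lucene_cost(100_000, props, counts))
```

With these made-up figures the entry estimate drops from the full index size to the smaller of the two per-property counts, and the two evaluated property restrictions halve the per-entry cost, so the Lucene index competes with the property index on a comparable scale.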