[ https://issues.apache.org/jira/browse/OAK-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-4816:
--------------------------------
    Component/s: query

> Property index: cost estimate with path restriction is too optimistic
> ---------------------------------------------------------------------
>
>                 Key: OAK-4816
>                 URL: https://issues.apache.org/jira/browse/OAK-4816
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 1.6
>
>
> The property index cost estimate is too optimistic when a query has both a 
> property restriction and a path restriction. The current algorithm, as 
> documented in 
> http://jackrabbit.apache.org/oak/docs/query/property-index.html#Cost_Estimation
> , assumes that matching entries are evenly distributed over the whole 
> repository. Often they are not. In the extreme case, _all_ 
> entries that match the property restriction are in the subtree that matches 
> the path restriction. Example: 
> * 10'000 nodes with property color "red".
> * 1 million nodes in the repository
> * 10'000 nodes in the subtree /content
> * query {{/jcr:root/content//\*[@color = 'red']}}
> Currently, the cost estimate is about 100, because there are about 10'000 
> entries for "red" and "/content" contains 1% of all nodes (10'000 * 1% = 100). 
> But in reality, all 10'000 entries with color "red" might be in that subtree.
> The cost estimate should take that into account and assume that at least 
> 80% of the matching nodes are in that subtree (provided the subtree contains 
> that many nodes).
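
The arithmetic in the description can be sketched as follows. This is only an illustration, not Oak's implementation: the class name, method names, and the way the 80% floor is combined with the subtree size (a min/max) are assumptions made for this example. Only the numbers and the 80% figure come from the issue text.

```java
// Hypothetical sketch of the two estimates discussed in OAK-4816.
// None of these names exist in Oak; they are illustrative only.
class PathRestrictionCostSketch {

    // From the issue text: assume at least 80% of matching entries are in the subtree.
    static final double MIN_FRACTION_IN_SUBTREE = 0.8;

    // Current behaviour as described: matching entries are assumed to be
    // evenly distributed, so the estimate is scaled by subtree size / repository size.
    static double uniformEstimate(long matchingEntries, long subtreeNodes, long totalNodes) {
        if (totalNodes <= 0) {
            return matchingEntries; // no size information: fall back to the unscaled count
        }
        double scaled = (double) matchingEntries * subtreeNodes / totalNodes;
        return Math.min(scaled, matchingEntries);
    }

    // Proposed behaviour, under one possible reading of the issue:
    // never estimate fewer than 80% of the matching entries, unless the
    // subtree itself holds fewer nodes than that.
    static double pessimisticEstimate(long matchingEntries, long subtreeNodes, long totalNodes) {
        double uniform = uniformEstimate(matchingEntries, subtreeNodes, totalNodes);
        double floor = Math.min(MIN_FRACTION_IN_SUBTREE * matchingEntries, subtreeNodes);
        return Math.max(uniform, floor);
    }

    public static void main(String[] args) {
        // Numbers from the example: 10'000 "red" entries, 10'000 nodes
        // under /content, 1 million nodes in the repository.
        System.out.println(uniformEstimate(10_000, 10_000, 1_000_000));
        System.out.println(pessimisticEstimate(10_000, 10_000, 1_000_000));
    }
}
```

With the example's numbers, the uniform estimate is about 100 and the floored estimate is about 8'000, which is much closer to the worst case of 10'000 entries in the subtree.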



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
