[ https://issues.apache.org/jira/browse/OAK-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Mueller updated OAK-4816:
--------------------------------
    Component/s: query

> Property index: cost estimate with path restriction is too optimistic
> ---------------------------------------------------------------------
>
>                 Key: OAK-4816
>                 URL: https://issues.apache.org/jira/browse/OAK-4816
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 1.6
>
>
> The property index cost estimation is too optimistic when a query has both
> a property restriction and a path restriction. The current algorithm, as
> documented in
> http://jackrabbit.apache.org/oak/docs/query/property-index.html#Cost_Estimation
> , assumes that matching entries are evenly distributed over the whole
> repository. This assumption often does not hold. In extreme cases, _all_
> entries that match the property restriction are in the subtree that matches
> the path restriction. Example:
> * 10'000 nodes with property color "red".
> * 1 million nodes in the repository
> * 10'000 nodes in the subtree /content
> * query {{/jcr:root/content//\*[@color = 'red']}}
> Currently, the cost estimate is about 100, because there are about 10'000
> entries for "red" and "/content" contains 1% of all nodes
> (10'000 * 1% = 100). But in reality, there might be 10'000 entries with
> color "red" in that subtree (that is, all of them).
> The cost estimation should take that into account, and assume that at least
> 80% of the matching nodes are in that subtree (if the subtree contains that
> many nodes).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
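
To make the arithmetic in the example concrete, here is a minimal sketch in Python comparing the two estimates. It is not Oak's cost code: the function names, the `min_fraction` parameter, and the way the 80% floor is combined with the uniform estimate are assumptions made for illustration, and only one possible reading of the proposal.

```python
def uniform_estimate(matching_entries, total_nodes, subtree_nodes):
    """Estimate described as the current behaviour: matching entries are
    assumed to be spread evenly over the whole repository."""
    return matching_entries * subtree_nodes / total_nodes


def pessimistic_estimate(matching_entries, total_nodes, subtree_nodes,
                         min_fraction=0.8):
    """Hypothetical reading of the proposal: assume at least min_fraction
    of the matching entries are in the subtree, but never more entries
    than the subtree has nodes."""
    uniform = uniform_estimate(matching_entries, total_nodes, subtree_nodes)
    floor = min(min_fraction * matching_entries, subtree_nodes)
    return max(uniform, floor)


if __name__ == "__main__":
    # Numbers from the example: 10'000 "red" entries, 1 million nodes,
    # 10'000 nodes below /content.
    print(uniform_estimate(10_000, 1_000_000, 10_000))
    print(pessimistic_estimate(10_000, 1_000_000, 10_000))
```

With the numbers from the issue, the uniform estimate comes out at about 100 and the pessimistic one at about 8'000. For a subtree smaller than 80% of the matching entries, the cap on subtree size takes over, which corresponds to the "if the subtree contains that many nodes" condition.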