[ https://issues.apache.org/jira/browse/OAK-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Mueller updated OAK-4323: -------------------------------- Fix Version/s: 1.6 > Query engine: index cost formula incorrect when using "limit" > ------------------------------------------------------------- > > Key: OAK-4323 > URL: https://issues.apache.org/jira/browse/OAK-4323 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query > Reporter: Thomas Mueller > Fix For: 1.6 > > > As described in OAK-2081, the cost formula currently used in the query engine > is not correct if "limit" is used, because it doesn't account for false > positives. > Example: Let's say there are two indexes: > * color: 10000 nodes with color=red, but a bit slower (lets say a remote > index), cost per entry is 1.5. > * size: 20000 nodes with size=M, but a bit faster (lets say a local index), > cost per entry is 1. > Without limit, the index for "color" should be used as 10000 * 1.5 = 15000 is > lower than 20000 * 1 = 20000. > With limit=100, then we could calculate as follows: there are at most 10000 > entries (according to index "color"), so the false positive rate of the > "size" index is at least 50%. So cost of "color" is 100 * 1.5 = 150. Cost of > "size" is 100 * 1 = 100, but with false positive rate of 50%, so cost is > actually 200. Therefor, still the index "color" should be used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)