[ https://issues.apache.org/jira/browse/OAK-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julian Reschke updated OAK-7379:
--------------------------------
    Fix Version/s: 1.9.0

> Lucene Index: per-column selectivity, assume 5 unique entries
> -------------------------------------------------------------
>
>                 Key: OAK-7379
>                 URL: https://issues.apache.org/jira/browse/OAK-7379
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Major
>              Labels: candidate_oak_1_8
>             Fix For: 1.9.0, 1.10
>
>
> Currently, if a query has a property restriction of the form "property = x",
> and the property is indexed in a Lucene property index, the estimated cost of
> the index is the number of documents indexed for that property. This is a
> very conservative estimate: it assumes all documents have the same value. So
> the cost is relatively high for that index.
> In almost all cases, there are many distinct values for a property. Rarely,
> there are few values, or a skewed distribution where one value accounts for
> most documents. But in almost all cases there are more than 5 distinct values.
> I think it makes sense to use 5 as the default value. It is still
> conservative (cost of the index is high), but much better than now.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
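
The difference between the current and the proposed estimate can be sketched as below. This is only an illustration of the arithmetic described in the issue, not Oak's actual cost code: the class name, the method names, and the reading of "assume 5 unique entries" as "divide the per-property document count by 5" are all assumptions made for this example.

```java
// Illustrative sketch only; none of these names come from Oak.
public class SelectivityEstimate {

    // Assumed default from the issue text: 5 distinct values per property.
    static final long ASSUMED_UNIQUE_ENTRIES = 5;

    // Current behaviour as described: every indexed document is assumed
    // to match "property = x", so the estimate is the full document count.
    static long currentEstimate(long docCount) {
        return docCount;
    }

    // Proposed behaviour (assumption: values are spread evenly over
    // 5 distinct entries, so one value matches about a fifth of them).
    static long proposedEstimate(long docCount) {
        if (docCount <= 0) {
            return 0;
        }
        // Never estimate fewer than one entry for a non-empty index.
        return Math.max(1, docCount / ASSUMED_UNIQUE_ENTRIES);
    }

    public static void main(String[] args) {
        long docs = 100_000;
        System.out.println("current:  " + currentEstimate(docs));
        System.out.println("proposed: " + proposedEstimate(docs));
    }
}
```

With 100,000 documents indexed for the property, the current estimate stays at 100,000 entries while the sketch's proposed estimate drops to 20,000, which is still conservative but makes the index more competitive against alternatives.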