[ 
https://issues.apache.org/jira/browse/OAK-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nuno Santos updated OAK-11764:
------------------------------
    Priority: Minor  (was: Major)

> Query planner: when multiple indexes have same cost, planner should choose 
> index deterministically
> --------------------------------------------------------------------------------------------------
>
>                 Key: OAK-11764
>                 URL: https://issues.apache.org/jira/browse/OAK-11764
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>            Reporter: Nuno Santos
>            Priority: Minor
>              Labels: indexing
>
> The query planner chooses the index with the lowest cost. If multiple 
> indexes evaluate to the same cost, the planner currently takes the first 
> index evaluated with that lowest cost. This is not deterministic, so the 
> same query may be evaluated by one index on one run and by a different 
> index on another. Although the results should be the same regardless of the 
> index used, customers may be (wrongly) relying on the query being evaluated 
> by the same index every time. For instance, if the query does not fully 
> specify the order of the results, different indexes may return them in 
> different orders. This does not violate the contract of the query 
> specification, but users may rely on the results always being in the same 
> order (for instance, to implement pagination).
> The indexes are evaluated in the class 
> [FulltextIndex|https://github.com/apache/jackrabbit-oak/blob/ff1080185d21bd10c017d8a8eb94236ef99c7e87/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndex.java#L116-L133],
> and taken from a Set with all the indexes, created in 
> [IndexLookup|https://github.com/apache/jackrabbit-oak/blob/ff1080185d21bd10c017d8a8eb94236ef99c7e87/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexLookup.java#L55-L70].
> This class uses a JDK HashSet to store the indexes. The iteration order of 
> this set changes on purpose from one run of the JDK to the next. Previously, 
> this class used a Guava HashSet, which always iterates in the order in 
> which the elements were added. This 
> [PR|https://github.com/apache/jackrabbit-oak/pull/1678] replaced the Guava 
> set with the JDK HashSet.
> We should ensure that the indexes are selected in a deterministic order, even 
> if they have the same cost. 
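
One possible shape for the deterministic selection requested above, as a
minimal Java sketch: pick the lowest cost and break ties on a stable key such
as the index path, so the choice no longer depends on Set iteration order.
IndexCandidate, DeterministicIndexChoice and the example paths are
hypothetical names for illustration only; they are not Oak classes and this
is not the actual planner API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for an index evaluated by the planner (not an Oak class).
final class IndexCandidate {
    final String path;  // illustrative index path, used as the tie-breaker
    final double cost;  // cost estimated for the query

    IndexCandidate(String path, double cost) {
        this.path = path;
        this.cost = cost;
    }
}

final class DeterministicIndexChoice {

    // Lowest cost wins; equal costs are ordered by path, so the result
    // does not depend on the iteration order of the input collection.
    static final Comparator<IndexCandidate> BY_COST_THEN_PATH =
            Comparator.comparingDouble((IndexCandidate c) -> c.cost)
                      .thenComparing(c -> c.path);

    static Optional<IndexCandidate> choose(Collection<IndexCandidate> candidates) {
        return candidates.stream().min(BY_COST_THEN_PATH);
    }

    public static void main(String[] args) {
        IndexCandidate a = new IndexCandidate("/oak:index/lucene-a", 2.0);
        IndexCandidate b = new IndexCandidate("/oak:index/lucene-b", 2.0);
        IndexCandidate c = new IndexCandidate("/oak:index/lucene-c", 5.0);

        // Same candidates, presented in two different orders.
        List<IndexCandidate> listOrder = Arrays.asList(b, c, a);
        Collection<IndexCandidate> setOrder = new HashSet<>(Arrays.asList(a, b, c));

        // Both lines print /oak:index/lucene-a
        System.out.println(choose(listOrder).get().path);
        System.out.println(choose(setOrder).get().path);
    }
}
```

Comparing on a stable key keeps the choice independent of the container. An
alternative would be to keep the indexes in an insertion-ordered set such as
java.util.LinkedHashSet, which makes "first with the lowest cost" repeatable
as long as the indexes are always added in the same order.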



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
