Nuno Santos created OAK-11764:
---------------------------------
Summary: Query planner: when multiple indexes have same cost,
planner should choose index deterministically
Key: OAK-11764
URL: https://issues.apache.org/jira/browse/OAK-11764
Project: Jackrabbit Oak
Issue Type: Bug
Components: indexing
Reporter: Nuno Santos
The query planner chooses the index with the lowest cost. But if multiple
indexes evaluate to the same cost, the planner currently takes the first
index that was evaluated with that lowest cost. This is not deterministic, so
the same query may be evaluated by one index on some runs and by a different
index on others. Although the results should be the same regardless of the
index being used, there can be cases where customers are (wrongly) relying on
the query being evaluated by the same index every time. For instance, if the
query does not precisely define the order of the results, different indexes
may return the results in different orders. This does not violate the
contract of the query specification, but users may rely on the results always
being in the same order (for instance, to implement pagination).
The indexes are evaluated in the class
[FulltextIndex|https://github.com/apache/jackrabbit-oak/blob/ff1080185d21bd10c017d8a8eb94236ef99c7e87/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndex.java#L116-L133]
and taken from a Set with all the indexes created in
[IndexLookup|https://github.com/apache/jackrabbit-oak/blob/ff1080185d21bd10c017d8a8eb94236ef99c7e87/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexLookup.java#L55-L70].
This class uses a JDK HashSet to store the indexes. The iteration order of
this set is changed on purpose from one run of the JDK to the next.
Previously, this class used a Guava HashSet, which always iterates in the
order in which the elements were added. This
[PR|https://github.com/apache/jackrabbit-oak/pull/1678] replaced the Guava
HashSet with the JDK HashSet.
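For illustration only: the HashSet Javadoc makes no guarantee about iteration order, whereas LinkedHashSet iterates in insertion order and TreeSet in sorted order. The sketch below shows both alternatives on plain strings. The class name, method names and index paths are hypothetical and are not taken from Oak.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class, not part of Oak.
final class IterationOrderDemo {

    // De-duplicates while keeping the order in which paths were first added.
    static List<String> insertionOrder(Collection<String> paths) {
        return new ArrayList<>(new LinkedHashSet<>(paths));
    }

    // De-duplicates and iterates in natural (lexicographic) String order,
    // independent of the order in which paths were added.
    static List<String> sortedOrder(Collection<String> paths) {
        return new ArrayList<>(new TreeSet<>(paths));
    }

    public static void main(String[] args) {
        // Made-up index paths, used only as sample data.
        List<String> paths = List.of(
                "/oak:index/indexC", "/oak:index/indexA", "/oak:index/indexB");
        System.out.println(insertionOrder(paths));
        System.out.println(sortedOrder(paths));
    }
}
```

Either collection gives a stable iteration order. Which one would suit IndexLookup, if any, is a design decision this sketch does not settle.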
We should ensure that the indexes are selected in a deterministic order, even
if they have the same cost.
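One possible way to get a deterministic choice is to break cost ties on a stable key such as the index path. The following is a minimal sketch of that idea, not the actual Oak planner code: the Candidate class, the choose method and the index paths are hypothetical names introduced for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not part of Oak.
final class DeterministicIndexChoice {

    // Stand-in for an index plan: an index path and its estimated cost.
    static final class Candidate {
        final String path;
        final double cost;

        Candidate(String path, double cost) {
            this.path = path;
            this.cost = cost;
        }
    }

    // Lowest cost first; equal costs are ordered by index path, so the
    // winner does not depend on the order in which candidates arrive.
    static final Comparator<Candidate> BY_COST_THEN_PATH =
            Comparator.comparingDouble((Candidate c) -> c.cost)
                      .thenComparing(c -> c.path);

    static Optional<Candidate> choose(List<Candidate> candidates) {
        return candidates.stream().min(BY_COST_THEN_PATH);
    }

    public static void main(String[] args) {
        // Two made-up indexes with the same cost, supplied in both orders.
        List<Candidate> first = List.of(
                new Candidate("/oak:index/indexB", 10.0),
                new Candidate("/oak:index/indexA", 10.0));
        List<Candidate> second = List.of(
                new Candidate("/oak:index/indexA", 10.0),
                new Candidate("/oak:index/indexB", 10.0));
        // Both calls select the same index regardless of input order.
        System.out.println(choose(first).get().path);
        System.out.println(choose(second).get().path);
    }
}
```

The tie-breaker here is the path only because it is unique and stable. Any other total order over the candidates would serve the same purpose.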
--
This message was sent by Atlassian Jira
(v8.20.10#820010)