[ https://issues.apache.org/jira/browse/LUCENE-8017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220527#comment-16220527 ]
Adrien Grand commented on LUCENE-8017:
--------------------------------------

There is a TODO about this issue in {{LRUQueryCache}}:

{noformat}
// TODO: should it be pluggable, eg. for queries that run on doc values?
final IndexReader.CacheHelper cacheHelper = context.reader().getCoreCacheHelper();
{noformat}

My idea was that we could add a {{CacheHelper Weight.getCacheHelper(LeafReaderContext)}} API that would tell how a query is allowed to be cached:
 - {{null}} if matches should never be cached
 - {{context.reader().getCoreCacheHelper()}} for queries that only depend on core data-structures, like phrase queries, point queries, etc.
 - {{context.reader().getReaderCacheHelper()}} for queries that run on doc values (or live docs, but I can't think of a use-case for looking at live docs in a query)

bq. One could either use marker interfaces

I thought about this at some point, but it doesn't work well with compound queries, i.e. what interface should ConstantScoreQuery and BooleanQuery implement?

bq. The easiest solution is to just exclude the Function queries from the cache

It has a similar issue, I think: how can we know that a BooleanQuery may not be cached? Do we need to unwrap all sub queries? What about 3rd-party compound queries that we cannot introspect?

bq. a cacheCost() method - the latter I quite like, as it means that different cache implementations can choose whether or not to cache in a more fine-grained manner

What would this cacheCost compute? Isn't it a metric that we already have with the scorer cost? This is a bit orthogonal to this issue, but I agree it would be good to avoid caching sub clauses whose cost is more than X times the cost of the entire query, in order to preserve good tail latencies.
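As a rough illustration of the {{getCacheHelper}} idea above, the sketch below shows how a compound query might combine the answers of its clauses without introspecting them. It is not Lucene code: {{CacheScope}}, {{SketchWeight}}, {{fixed}} and {{combine}} are hypothetical stand-ins, with {{NEVER}}, {{READER}} and {{CORE}} standing for the three answers ({{null}}, the reader cache helper, the core cache helper). The rule that the most restrictive clause wins is an assumption, not something stated in the comment.

```java
import java.util.Arrays;
import java.util.List;

public class CacheHelperSketch {

    /**
     * Hypothetical stand-in for the three possible answers, ordered from
     * most restrictive to least restrictive. Not a Lucene type.
     */
    public enum CacheScope { NEVER, READER, CORE }

    /** Hypothetical stand-in for a Weight exposing the proposed method. */
    public interface SketchWeight {
        CacheScope cacheScope();
    }

    /** Helper that builds a leaf weight with a fixed answer. */
    public static SketchWeight fixed(CacheScope scope) {
        return () -> scope;
    }

    /**
     * Assumed rule for compound queries: ask every clause and keep the
     * most restrictive answer, so no clause has to be unwrapped.
     */
    public static CacheScope combine(List<SketchWeight> clauses) {
        CacheScope result = CacheScope.CORE;
        for (SketchWeight clause : clauses) {
            CacheScope scope = clause.cacheScope();
            if (scope.compareTo(result) < 0) {
                result = scope;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SketchWeight phrase = fixed(CacheScope.CORE);          // core data structures only
        SketchWeight docValuesRange = fixed(CacheScope.READER); // runs on doc values
        SketchWeight scoreDependent = fixed(CacheScope.NEVER);  // e.g. selects hits by score

        System.out.println(combine(Arrays.asList(phrase, phrase)));
        System.out.println(combine(Arrays.asList(phrase, docValuesRange)));
        System.out.println(combine(Arrays.asList(phrase, docValuesRange, scoreDependent)));
    }
}
```

Because each clause only reports its own answer, a 3rd-party compound query would participate simply by delegating to its children, which is the case that marker interfaces handle poorly.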

> FunctionRangeQuery and FunctionMatchQuery can pollute the QueryCache
> --------------------------------------------------------------------
>
>                 Key: LUCENE-8017
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8017
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>
> The QueryCache assumes that queries will return the same set of documents
> when run over the same segment, independent of all other segments held by the
> parent IndexSearcher. However, both FunctionRangeQuery and
> FunctionMatchQuery can select hits based on score, which depends on term
> statistics over the whole index, and could therefore theoretically return
> different result sets on a given segment.