jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333896983
##########
File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
##########
@@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti
       if (docIdSet == null) {
         if (policy.shouldCache(in.getQuery())) {
-          docIdSet = cache(context);
-          putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+          final ScorerSupplier supplier = in.scorerSupplier(context);
+          if (supplier == null) {
+            putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+            return null;
+          }
+
+          final long cost = supplier.cost();
+          return new ScorerSupplier() {
+            @Override
+            public Scorer get(long leadCost) throws IOException {
+              // skip cache operation which would slow query down too much
+              if ((cost > skipCacheCost || cost > leadCost * skipCacheFactor)
+                  && in.getQuery() instanceof IndexOrDocValuesQuery) {

Review comment:
This PR is mainly for IndexOrDocValuesQuery now. As discussed earlier, IndexOrDocValuesQuery slows down because the caching action reads a large amount of data, whereas only a small amount of data is read from doc values when not caching. I haven't found any other type of query that reads much more data for caching than it really needs.
@jpountz Looking forward to more discussion if you think this PR should apply to all query types.
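For reference, the skip condition in the hunk above can be read in isolation as a small predicate. This is only a rough sketch, not the code in the PR: the class name `SkipCacheSketch`, the method `shouldSkipCaching`, and the boolean flag standing in for the `instanceof IndexOrDocValuesQuery` check are all hypothetical, and the threshold values used with it are made-up illustrations, since the diff does not show what `skipCacheCost` and `skipCacheFactor` are set to.

```java
// Hypothetical standalone sketch of the skip-cache condition; not the actual LRUQueryCache code.
final class SkipCacheSketch {

  /**
   * Returns true when caching should be skipped: the clause is expensive either in
   * absolute terms (cost > skipCacheCost) or relative to the lead iterator
   * (cost > leadCost * skipCacheFactor), and the query is an IndexOrDocValuesQuery.
   *
   * The boolean parameter is an assumption standing in for
   * "in.getQuery() instanceof IndexOrDocValuesQuery" so the sketch has no Lucene dependency.
   */
  static boolean shouldSkipCaching(long cost, long leadCost, long skipCacheCost,
                                   long skipCacheFactor, boolean isIndexOrDocValuesQuery) {
    // Plain long multiplication, as in the diff; it can overflow for very large leadCost.
    boolean tooExpensive = cost > skipCacheCost || cost > leadCost * skipCacheFactor;
    return tooExpensive && isIndexOrDocValuesQuery;
  }
}
```

With illustrative thresholds of `skipCacheCost = 100_000_000` and `skipCacheFactor = 250`, a clause of cost 1,000,000 led by an iterator of cost 10 would skip caching under this reading, while the same clause on any other query type would still be cached.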