[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328629952 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +911,36 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +// @return true if synchronous caching is needed, false otherwise +private boolean cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return false; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//Remove the key from inflight -- the key is loaded now +Object retValue = inFlightAsyncLoadQueries.remove(in.getQuery()); + +assert retValue != null; Review comment: Add a comment here about why `retValue` better not be `null`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328626689 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +911,36 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +// @return true if synchronous caching is needed, false otherwise +private boolean cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return false; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//Remove the key from inflight -- the key is loaded now Review comment: Add space after `//`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r329305339 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +911,36 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +// @return true if synchronous caching is needed, false otherwise +private boolean cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return false; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//Remove the key from inflight -- the key is loaded now +Object retValue = inFlightAsyncLoadQueries.remove(in.getQuery()); + +assert retValue != null; + +return null; + }); + try { +executor.execute(task); + } catch (RejectedExecutionException e) { Review comment: Can you updated `IndexSearcher` javadocs indicating that 1) the `Executor` is used for concurrent query caching, and 2) how we handle `RejectedExecutionException` from it (caller runs)? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328208652 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +920,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) throws RejectedExecutionException { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//Remove the key from inflight -- the key is loaded now +inFlightAsyncLoadQueries.remove(in.getQuery()); Review comment: Can you `assert` the return value here? I.e. we should *always* succeed in removing this query from the set. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328207678 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -813,8 +881,28 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + boolean performSynchronousCaching = !(executor != null); Review comment: Fix double negative here too? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328208798 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//remove the key from inflight -- the key is loaded now +inFlightAsyncLoadQueries.remove(in.getQuery()); +return null; + }); + executor.execute(task); Review comment: Yeah it is, thanks for opening separate issue ... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328206271 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -732,8 +779,29 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + boolean performSynchronousCaching = !(executor != null); Review comment: Can you remove the double-negative here? I.e.: ``` boolean performSynchronousCaching = executor == null ``` Also maybe rename to `cacheSynchronously`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328208409 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -813,8 +881,28 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + boolean performSynchronousCaching = !(executor != null); + // If asynchronous caching is requested, perform the same and return + // the uncached iterator + if (!performSynchronousCaching) { +try { + cacheAsynchronously(context, cacheHelper); +} catch (RejectedExecutionException e) { Review comment: Hmm, can you shrink-wrap this exception handling? I.e., handle it down in `cacheAsynchronously` where we ask the executor to execute the task, instead of up here? Maybe then return a `boolean` from `cacheAsynchronously` to know (here) if caller should then cache synchronously ... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328207416 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -732,8 +779,29 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + boolean performSynchronousCaching = !(executor != null); + + // If asynchronous caching is requested, perform the same and return + // the uncached iterator + if (!performSynchronousCaching) { +try { + cacheAsynchronously(context, cacheHelper); +} catch (RejectedExecutionException e) { + // Trigger synchronous caching + performSynchronousCaching = true; +} + +// If async caching failed, synchronous caching will +// be performed, hence do not return the uncached value +if (!performSynchronousCaching) { Review comment: Change to `performSynchronousCaching == false`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327870291 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); Review comment: Ahh OK let's do that later then. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327566678 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//remove the key from inflight -- the key is loaded now Review comment: s/`//remove`/`// Remove`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327565812 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -732,8 +774,16 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + // If asynchronous caching is requested, perform the same and return + // the uncached iterator + if (executor != null) { +cacheAsynchronously(context, cacheHelper); +return in.scorerSupplier(context); + } + else { +docIdSet = cache(context); Review comment: I think this means async caching can be more efficient, because with single threaded caching, multiple threads could do the work to try the cache the same `Query`, with only one of them winning in the end, but with async caching, we ensure only one search thread does the caching? So e.g. red-line QPS (capacity) could be a bit higher with async, if queries are often duplicated at once? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327562068 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -656,13 +691,20 @@ public long ramBytesUsed() { // threads when IndexSearcher is created with threads private final AtomicBoolean used; +private Executor executor; + CachingWrapperWeight(Weight in, QueryCachingPolicy policy) { super(in.getQuery(), 1f); this.in = in; this.policy = policy; used = new AtomicBoolean(false); } +CachingWrapperWeight(Weight in, QueryCachingPolicy policy, Executor executor) { + this(in, policy); Review comment: Duplicate this constructor a bit and make the `executor` final? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327566083 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//remove the key from inflight -- the key is loaded now +inFlightAsyncLoadQueries.remove(in.getQuery()); +return null; + }); + executor.execute(task); Review comment: Hmm what if the executor rejects the task? Should we run it synchronously in that case? Or do we have an infinite blocking queue for it? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327567246 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) { + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) { +return; + } + + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); Review comment: Can you save the return value from `putIfAbsent` call and then `assert` it's `null`? I.e. the cache should never already have this query since we checked up-front in thread tight manner? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326564032 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +894,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) throws IOException { + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//remove the key from inflight -- the key is loaded now +inFlightAsyncLoadQueries.remove(in.getQuery()); +return null; + }); + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.contains(in.getQuery())) { Review comment: Can you move the `task` creation to after this `if`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326565944 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -448,13 +459,42 @@ void assertConsistent() { } } + // pkg-private for testing + // return the list of queries being loaded asynchronously + List inFlightQueries() { +lock.lock(); +try { + return new ArrayList<>(inFlightAsyncLoadQueries); Review comment: I'm still confused about `lock` -- do we always hold the lock when checking if query is already in the map? If so, we don't need a `ConcurrentHashMap`? If not, why do we even have the `lock` since it is a `ConcurrentHashMap`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326565568 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +894,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +private void cacheAsynchronously(LeafReaderContext context, IndexReader.CacheHelper cacheHelper) throws IOException { + FutureTask task = new FutureTask<>(() -> { +DocIdSet localDocIdSet = cache(context); +putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + +//remove the key from inflight -- the key is loaded now +inFlightAsyncLoadQueries.remove(in.getQuery()); +return null; + }); + /* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ + if (inFlightAsyncLoadQueries.contains(in.getQuery())) { +return; + } + inFlightAsyncLoadQueries.add(in.getQuery()); Review comment: Hmm still a concurrency bug here, allowing two threads to cache the same query -- instead of separate `.contains` and `.add` calls, can you simply call `.add` and look at its return value to decide whether you should in fact cache? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326563068 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -95,6 +98,11 @@ // maps queries that are contained in the cache to a singleton so that this // cache does not store several copies of the same query private final Map uniqueQueries; + // Marks the inflight queries that are being asynchronously loaded into the cache + // This is used primarily to ensure that multiple threads do not trigger loading Review comment: Remove `primarily`? This is its sole purpose? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325915728 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -262,6 +284,15 @@ DocIdSet get(Query key, IndexReader.CacheHelper cacheHelper) { assert lock.isHeldByCurrentThread(); assert key instanceof BoostQuery == false; assert key instanceof ConstantScoreQuery == false; + +/* + * If the current query is already being asynchronously cached, + * do not trigger another cache operation + */ +if (inFlightAsyncLoadQueries.contains(key)) { Review comment: Hmm can we fix the concurrency issue here, e.g. just call `inFlightAsyncLoadQueries.add` and if the returned value is `false` (because the query was already in the set) then throw the exception? Or, move this check down into the `cacheAsynchronously` method where we are doing the `add`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325915208 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -832,5 +924,20 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi)); } +// Perform a cache load asynchronously +// NOTE: Potentially, two threads can trigger a load for the same query concurrently as the check for presence of the query +// done upstream and the lock is not he Review comment: Hmm this sentence abruptly ended? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325844374 ## File path: lucene/CHANGES.txt ## @@ -56,6 +56,9 @@ Improvements * LUCENE-8937: Avoid agressive stemming on numbers in the FrenchMinimalStemmer. (Adrien Gallou via Tomoko Uchida) +* LUCENE-8213: LRUQueryCache#doCache can now use IndexSearcher's Executor(if present) Review comment: Space before `(if present`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r324322219 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -807,14 +896,35 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException { DocIdSet docIdSet; try { docIdSet = get(in.getQuery(), cacheHelper); + } catch (AsyncQueryLoadInProgressException e) { +// Query is being loaded asynchronously. Use uncached version but +// do not do a new load of the same query into the cache +return in.bulkScorer(context); } finally { lock.unlock(); } if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + // If asynchronous caching is requested, perform the same and return + // the uncached iterator + if (executor != null) { +FutureTask task = new FutureTask<>(() -> { + DocIdSet localDocIdSet = cache(context); + putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + + //remove the key from inflight -- the key is loaded now + inFlightAsyncLoadQueries.remove(in.getQuery()); Review comment: Disappointing that all this code is duplicated between `scorerSupplier` and `bulkScorer` methods ... can we somehow factor to share? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r324320544 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -89,12 +92,30 @@ */ public class LRUQueryCache implements QueryCache, Accountable { + /** Throw this error when a query is asynchronously being loaded in the cache + * and a read for the same is requested + */ + private static final class AsyncQueryLoadInProgressException extends RuntimeException { + +/** + * Sole constructor. + */ +public AsyncQueryLoadInProgressException(String query) { + super(query); +} + } + private final int maxSize; private final long maxRamBytesUsed; private final Predicate leavesToCache; // maps queries that are contained in the cache to a singleton so that this // cache does not store several copies of the same query private final Map uniqueQueries; + // Marks the inflight queries that are being asynchronously loaded into the cache + // This is used primarily to ensure that multiple threads do not trigger loading + // of the same query in the same cache. We use a set because it is an invariant that + // the entries of this data structure be unique. + private final Set inFlightAsyncLoadQueries = Collections.newSetFromMap(new ConcurrentHashMap()); Review comment: It looks like we only ever access this map/set while holding the `lock`? Can we use an ordinary `HashMap` if so? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/815#discussion_r324321360 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -726,14 +794,35 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti DocIdSet docIdSet; try { docIdSet = get(in.getQuery(), cacheHelper); + } catch (AsyncQueryLoadInProgressException e) { +// Query is being loaded asynchronously. Use uncached version but +// do not do a new load of the same query into the cache +return in.scorerSupplier(context); } finally { lock.unlock(); } if (docIdSet == null) { if (policy.shouldCache(in.getQuery())) { - docIdSet = cache(context); - putIfAbsent(in.getQuery(), docIdSet, cacheHelper); + // If asynchronous caching is requested, perform the same and return + // the uncached iterator + if (executor != null) { +FutureTask task = new FutureTask<>(() -> { + DocIdSet localDocIdSet = cache(context); + putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper); + + //remove the key from inflight -- the key is loaded now + inFlightAsyncLoadQueries.remove(in.getQuery()); + return null; +}); +inFlightAsyncLoadQueries.add(in.getQuery()); Review comment: I think we have a concurrency bug here -- if the same query across different threads is checked for caching, then two threads can get to this point, and both can compute the cached result and store it? Because, this `.add` happens at a different time from the above check (`.contains`)... Maybe this doesn't matter? It just means two threads do the work of caching, and one is wasted? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org