[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-28 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328629952
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +911,36 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+// @return true if synchronous caching is needed, false otherwise
+private boolean cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return false;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//Remove the key from inflight -- the key is loaded now
+Object retValue = inFlightAsyncLoadQueries.remove(in.getQuery());
+
+assert retValue != null;
 
 Review comment:
   Add a comment here about why `retValue` better not be `null`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-28 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328626689
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +911,36 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+// @return true if synchronous caching is needed, false otherwise
+private boolean cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return false;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//Remove the key from inflight -- the key is loaded now
 
 Review comment:
   Add space after `//`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-28 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r329305339
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +911,36 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+// @return true if synchronous caching is needed, false otherwise
+private boolean cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return false;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//Remove the key from inflight -- the key is loaded now
+Object retValue = inFlightAsyncLoadQueries.remove(in.getQuery());
+
+assert retValue != null;
+
+return null;
+  });
+  try {
+executor.execute(task);
+  } catch (RejectedExecutionException e) {
 
 Review comment:
   Can you updated `IndexSearcher` javadocs indicating that 1) the `Executor` 
is used for concurrent query caching, and 2) how we handle 
`RejectedExecutionException` from it (caller runs)?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-25 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328208652
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +920,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) throws RejectedExecutionException {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//Remove the key from inflight -- the key is loaded now
+inFlightAsyncLoadQueries.remove(in.getQuery());
 
 Review comment:
   Can you `assert` the return value here?  I.e. we should *always* succeed in 
removing this query from the set.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-25 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328207678
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -813,8 +881,28 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  boolean performSynchronousCaching = !(executor != null);
 
 Review comment:
   Fix double negative here too?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-25 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328208798
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//remove the key from inflight -- the key is loaded now
+inFlightAsyncLoadQueries.remove(in.getQuery());
+return null;
+  });
+  executor.execute(task);
 
 Review comment:
   Yeah it is, thanks for opening separate issue ...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-25 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328206271
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +779,29 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  boolean performSynchronousCaching = !(executor != null);
 
 Review comment:
   Can you remove the double-negative here?  I.e.:
   
   ```
 boolean performSynchronousCaching = executor == null
   ```
   
   Also maybe rename to `cacheSynchronously`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-25 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328208409
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -813,8 +881,28 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  boolean performSynchronousCaching = !(executor != null);
+  // If asynchronous caching is requested, perform the same and return
+  // the uncached iterator
+  if (!performSynchronousCaching) {
+try {
+  cacheAsynchronously(context, cacheHelper);
+} catch (RejectedExecutionException e) {
 
 Review comment:
   Hmm, can you shrink-wrap this exception handling?  I.e., handle it down in 
`cacheAsynchronously` where we ask the executor to execute the task, instead of 
up here?
   
   Maybe then return a `boolean` from `cacheAsynchronously` to know (here) if 
caller should then cache synchronously ...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-25 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r328207416
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +779,29 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  boolean performSynchronousCaching = !(executor != null);
+
+  // If asynchronous caching is requested, perform the same and return
+  // the uncached iterator
+  if (!performSynchronousCaching) {
+try {
+  cacheAsynchronously(context, cacheHelper);
+} catch (RejectedExecutionException e) {
+  // Trigger synchronous caching
+  performSynchronousCaching = true;
+}
+
+// If async caching failed, synchronous caching will
+// be performed, hence do not return the uncached value
+if (!performSynchronousCaching) {
 
 Review comment:
   Change to `performSynchronousCaching == false`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-24 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327870291
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
 
 Review comment:
   Ahh OK let's do that later then.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-24 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327566678
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//remove the key from inflight -- the key is loaded now
 
 Review comment:
   s/`//remove`/`// Remove`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-24 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327565812
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +774,16 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  // If asynchronous caching is requested, perform the same and return
+  // the uncached iterator
+  if (executor != null) {
+cacheAsynchronously(context, cacheHelper);
+return in.scorerSupplier(context);
+  }
+  else {
+docIdSet = cache(context);
 
 Review comment:
   I think this means async caching can be more efficient, because with single 
threaded caching, multiple threads could do the work to try the cache the same 
`Query`, with only one of them winning in the end, but with async caching, we 
ensure only one search thread does the caching?  So e.g. red-line QPS 
(capacity) could be a bit higher with async, if queries are often duplicated at 
once?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-24 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327562068
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -656,13 +691,20 @@ public long ramBytesUsed() {
 // threads when IndexSearcher is created with threads
 private final AtomicBoolean used;
 
+private Executor executor;
+
 CachingWrapperWeight(Weight in, QueryCachingPolicy policy) {
   super(in.getQuery(), 1f);
   this.in = in;
   this.policy = policy;
   used = new AtomicBoolean(false);
 }
 
+CachingWrapperWeight(Weight in, QueryCachingPolicy policy, Executor 
executor) {
+  this(in, policy);
 
 Review comment:
   Duplicate this constructor a bit and make the `executor` final?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-24 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327566083
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//remove the key from inflight -- the key is loaded now
+inFlightAsyncLoadQueries.remove(in.getQuery());
+return null;
+  });
+  executor.execute(task);
 
 Review comment:
   Hmm what if the executor rejects the task?  Should we run it synchronously 
in that case?  Or do we have an infinite blocking queue for it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-24 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r327567246
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +889,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) {
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.add(in.getQuery()) == false) {
+return;
+  }
+
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
 
 Review comment:
   Can you save the return value from `putIfAbsent` call and then `assert` it's 
`null`?  I.e. the cache should never already have this query since we checked 
up-front in thread tight manner?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-20 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326564032
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +894,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) throws IOException {
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//remove the key from inflight -- the key is loaded now
+inFlightAsyncLoadQueries.remove(in.getQuery());
+return null;
+  });
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.contains(in.getQuery())) {
 
 Review comment:
   Can you move the `task` creation to after this `if`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-20 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326565944
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -448,13 +459,42 @@ void assertConsistent() {
 }
   }
 
+  // pkg-private for testing
+  // return the list of queries being loaded asynchronously
+  List inFlightQueries() {
+lock.lock();
+try {
+  return new ArrayList<>(inFlightAsyncLoadQueries);
 
 Review comment:
   I'm still confused about `lock` -- do we always hold the lock when checking 
if query is already in the map?  If so, we don't need a `ConcurrentHashMap`?  
If not, why do we even have the `lock` since it is a `ConcurrentHashMap`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-20 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326565568
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +894,25 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+private void cacheAsynchronously(LeafReaderContext context, 
IndexReader.CacheHelper cacheHelper) throws IOException {
+  FutureTask task = new FutureTask<>(() -> {
+DocIdSet localDocIdSet = cache(context);
+putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+//remove the key from inflight -- the key is loaded now
+inFlightAsyncLoadQueries.remove(in.getQuery());
+return null;
+  });
+  /*
+   * If the current query is already being asynchronously cached,
+   * do not trigger another cache operation
+   */
+  if (inFlightAsyncLoadQueries.contains(in.getQuery())) {
+return;
+  }
+  inFlightAsyncLoadQueries.add(in.getQuery());
 
 Review comment:
   Hmm still a concurrency bug here, allowing two threads to cache the same 
query -- instead of separate `.contains` and `.add` calls, can you simply call 
`.add` and look at its return value to decide whether you should in fact cache?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-20 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r326563068
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -95,6 +98,11 @@
   // maps queries that are contained in the cache to a singleton so that this
   // cache does not store several copies of the same query
   private final Map uniqueQueries;
+  // Marks the inflight queries that are being asynchronously loaded into the 
cache
+  // This is used primarily to ensure that multiple threads do not trigger 
loading
 
 Review comment:
   Remove `primarily`?  This is its sole purpose?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325915728
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -262,6 +284,15 @@ DocIdSet get(Query key, IndexReader.CacheHelper 
cacheHelper) {
 assert lock.isHeldByCurrentThread();
 assert key instanceof BoostQuery == false;
 assert key instanceof ConstantScoreQuery == false;
+
+/*
+ * If the current query is already being asynchronously cached,
+ * do not trigger another cache operation
+ */
+if (inFlightAsyncLoadQueries.contains(key)) {
 
 Review comment:
   Hmm can we fix the concurrency issue here, e.g. just call 
`inFlightAsyncLoadQueries.add` and if the returned value is `false` (because 
the query was already in the set) then throw the exception?  Or, move this 
check down into the `cacheAsynchronously` method where we are doing the `add`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325915208
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -832,5 +924,20 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   return new DefaultBulkScorer(new ConstantScoreScorer(this, 0f, 
ScoreMode.COMPLETE_NO_SCORES, disi));
 }
 
+// Perform a cache load asynchronously
+// NOTE: Potentially, two threads can trigger a load for the same query 
concurrently as the check for presence of the query
+// done upstream and the lock is not he
 
 Review comment:
   Hmm this sentence abruptly ended?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-18 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r325844374
 
 

 ##
 File path: lucene/CHANGES.txt
 ##
 @@ -56,6 +56,9 @@ Improvements
 * LUCENE-8937: Avoid agressive stemming on numbers in the FrenchMinimalStemmer.
   (Adrien Gallou via Tomoko Uchida)
 
+* LUCENE-8213: LRUQueryCache#doCache can now use IndexSearcher's Executor(if 
present)
 
 Review comment:
   Space before `(if present`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-13 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r324322219
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -807,14 +896,35 @@ public BulkScorer bulkScorer(LeafReaderContext context) 
throws IOException {
   DocIdSet docIdSet;
   try {
 docIdSet = get(in.getQuery(), cacheHelper);
+  } catch (AsyncQueryLoadInProgressException e) {
+// Query is being loaded asynchronously. Use uncached version but
+// do not do a new load of the same query into the cache
+return in.bulkScorer(context);
   } finally {
 lock.unlock();
   }
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  // If asynchronous caching is requested, perform the same and return
+  // the uncached iterator
+  if (executor != null) {
+FutureTask task = new FutureTask<>(() -> {
+  DocIdSet localDocIdSet = cache(context);
+  putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+  //remove the key from inflight -- the key is loaded now
+  inFlightAsyncLoadQueries.remove(in.getQuery());
 
 Review comment:
   Disappointing that all this code is duplicated between `scorerSupplier` and 
`bulkScorer` methods ... can we somehow factor to share?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-13 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r324320544
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -89,12 +92,30 @@
  */
 public class LRUQueryCache implements QueryCache, Accountable {
 
+  /** Throw this error when a query is asynchronously being loaded in the cache
+   *  and a read for the same is requested
+   */
+  private static final class AsyncQueryLoadInProgressException extends 
RuntimeException {
+
+/**
+ * Sole constructor.
+ */
+public AsyncQueryLoadInProgressException(String query) {
+  super(query);
+}
+  }
+
   private final int maxSize;
   private final long maxRamBytesUsed;
   private final Predicate leavesToCache;
   // maps queries that are contained in the cache to a singleton so that this
   // cache does not store several copies of the same query
   private final Map uniqueQueries;
+  // Marks the inflight queries that are being asynchronously loaded into the 
cache
+  // This is used primarily to ensure that multiple threads do not trigger 
loading
+  // of the same query in the same cache. We use a set because it is an 
invariant that
+  // the entries of this data structure be unique.
+  private final Set inFlightAsyncLoadQueries = 
Collections.newSetFromMap(new ConcurrentHashMap());
 
 Review comment:
   It looks like we only ever access this map/set while holding the `lock`?  
Can we use an ordinary `HashMap` if so?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-13 Thread GitBox
mikemccand commented on a change in pull request #815: LUCENE-8213: Introduce 
Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#discussion_r324321360
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -726,14 +794,35 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
   DocIdSet docIdSet;
   try {
 docIdSet = get(in.getQuery(), cacheHelper);
+  } catch (AsyncQueryLoadInProgressException e) {
+// Query is being loaded asynchronously. Use uncached version but
+// do not do a new load of the same query into the cache
+return in.scorerSupplier(context);
   } finally {
 lock.unlock();
   }
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  // If asynchronous caching is requested, perform the same and return
+  // the uncached iterator
+  if (executor != null) {
+FutureTask task = new FutureTask<>(() -> {
+  DocIdSet localDocIdSet = cache(context);
+  putIfAbsent(in.getQuery(), localDocIdSet, cacheHelper);
+
+  //remove the key from inflight -- the key is loaded now
+  inFlightAsyncLoadQueries.remove(in.getQuery());
+  return null;
+});
+inFlightAsyncLoadQueries.add(in.getQuery());
 
 Review comment:
   I think we have a concurrency bug here -- if the same query across different 
threads is checked for caching, then two threads can get to this point, and 
both can compute the cached result and store it?  Because, this `.add` happens 
at a different time from the above check (`.contains`)...
   
   Maybe this doesn't matter?  It just means two threads do the work of 
caching, and one is wasted?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org