ben-manes commented on code in PR #1155:
URL: https://github.com/apache/solr/pull/1155#discussion_r1043745746


##########
solr/core/src/java/org/apache/solr/core/TransientSolrCoreCacheDefault.java:
##########
@@ -70,10 +80,15 @@ public TransientSolrCoreCacheDefault(CoreContainer coreContainer) {
     Caffeine<String, SolrCore> transientCoresCacheBuilder =
         Caffeine.newBuilder()
             .initialCapacity(initialCapacity)
-            // Use the current thread to queue evicted cores for closing. This ensures the
-            // cache max size is respected (with a different thread the max size would be
-            // respected asynchronously only eventually).
-            .executor(Runnable::run)
+            // Do NOT use the current thread to queue evicted cores for closing. Although this
+            // ensures the cache max size is respected only eventually, it should be a very
+            // short period of time until the cache maintenance task kicks in.
+            // The onEvict method needs the writeLock from SolrCores to operate safely.
+            // However, the current thread most probably has acquired a read-lock already
+            // somewhere up the call stack and would deadlock.
+            // Note that Caffeine cache has an internal maintenance task rescheduling logic which
+            // explicitly checks on the common pool.
+            .executor(ForkJoinPool.commonPool())

Review Comment:
   I am still reluctant to introduce an eviction approver, as I believe it is an easy but wrong answer. It is perfectly pragmatic for a private API where the caveats are known, but for a library it is a poor API with footguns that would have a large blast radius.
   
   I would use a secondary cache with weak-referenced values for automatic cleanup. That looks similar to the overflow cache here, except that, by using reference counting, it is discarded by the GC once nothing references it. This way there are no memory leaks, and the same instance is retrievable if any other thread holds a strong reference to it, avoiding duplication due to races. If simpler, the second cache can be inclusive, since it is bounded by what is GC-reachable rather than by a size/time limit, so juggling exclusivity might be useless logic that does not save any memory. By using Map computations, the interactions between the caches can be atomic, e.g. loading from the weak cache if absent from the primary one (see the sketch below).
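   
   To make the shape concrete, here is a rough sketch of the two-cache idea, not the Solr implementation: a bounded primary cache whose evictions park the core in a weak-valued overflow cache, and a `Map` computation that atomically falls back to the overflow cache on lookup. The class and method names, the `MAX_SIZE` bound, and the use of a removal listener for the hand-off are illustrative assumptions.
   
```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;
import org.apache.solr.core.SolrCore;

/** Illustrative sketch only; not the Solr implementation. */
class TransientCoreCacheSketch {
  private static final int MAX_SIZE = 128; // illustrative bound

  // Weak-valued overflow cache: entries vanish once the GC sees no strong
  // reference to the core, so there is no size/time bound to juggle.
  private final Cache<String, SolrCore> overflow =
      Caffeine.newBuilder().weakValues().build();

  // Bounded primary cache: on eviction, park the core in the overflow cache
  // instead of closing it, so a still-referenced instance stays retrievable.
  private final Cache<String, SolrCore> primary =
      Caffeine.newBuilder()
          .maximumSize(MAX_SIZE)
          .removalListener(
              (String name, SolrCore core, RemovalCause cause) -> {
                if (cause.wasEvicted() && core != null) {
                  overflow.put(name, core);
                }
              })
          .build();

  // The Map computation keeps the cross-cache lookup atomic: if the core fell
  // out of the primary cache but another thread still holds a strong reference,
  // the same instance is promoted back rather than a duplicate being created.
  // Returns null if the core is in neither cache and must be loaded by the caller.
  SolrCore lookupTransientCore(String name) {
    return primary.asMap().computeIfAbsent(name, overflow::getIfPresent);
  }
}
```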
   
   A caveat is that with a weak value, a listener on that cache would see a null value (as it has been GC'd), so it may not be immediately compatible with `SolrCores.queueCoreToClose(core)`. It may be that refactoring to coordinate the close based on reachability could solve that and remove the complexity that `SolrCores` must go through to safely shut down an instance.
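   
   A small illustration of that caveat, assuming a removal listener is attached to the weak-valued cache: once the core is reclaimed, the listener fires with `RemovalCause.COLLECTED` and a null value, so the close cannot be driven from the callback and would have to be tied to reachability instead. The cleanup strategy in the comment is only a suggestion, not an existing Solr mechanism.
   
```java
Cache<String, SolrCore> overflow =
    Caffeine.newBuilder()
        .weakValues()
        .removalListener(
            (String name, SolrCore core, RemovalCause cause) -> {
              if (cause == RemovalCause.COLLECTED) {
                // core is null here: the GC already reclaimed the instance, so
                // SolrCores.queueCoreToClose(core) cannot be invoked from this
                // callback. The actual close would have to be coordinated by
                // reachability instead (e.g. a java.lang.ref.Cleaner registered
                // when the core is created).
              }
            })
        .build();
```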



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

