Joerg Hoh created SLING-12473:
---------------------------------

             Summary: Lock contention in ScriptDependencyResolver
                 Key: SLING-12473
                 URL: https://issues.apache.org/jira/browse/SLING-12473
             Project: Sling
          Issue Type: Improvement
          Components: HTL
    Affects Versions: Scripting HTL Engine 1.4.24-1.4.0
            Reporter: Joerg Hoh


This is a follow-up to SLING-12344.

Even with the improvements added by SLING-12344 I see these stacktraces, 
especially when an instance is just starting up.
{noformat}
        at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
        - parking to wait for  <0x0000000469fbac10> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
        at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:194)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt([email protected]/AbstractQueuedSynchronizer.java:885)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared([email protected]/AbstractQueuedSynchronizer.java:1009)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared([email protected]/AbstractQueuedSynchronizer.java:1324)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock([email protected]/ReentrantReadWriteLock.java:738)
        at org.apache.sling.scripting.sightly.impl.utils.ScriptDependencyResolver.resolveScript(ScriptDependencyResolver.java:127)
        at org.apache.sling.scripting.sightly.impl.engine.extension.use.RenderUnitProvider.provide(RenderUnitProvider.java:95)
        at org.apache.sling.scripting.sightly.impl.engine.extension.use.UseRuntimeExtension.call(UseRuntimeExtension.java:71)
        at org.apache.sling.scripting.sightly.impl.engine.runtime.RenderContextImpl.call(RenderContextImpl.java:72)
{noformat}

It seems to me that the current code already acquires a read lock when entering 
the method, so whenever one thread holds the write lock, all threads invoking 
this method are blocked until the write lock is released. This happens even for 
requests that would get a cache hit.
For that reason, as long as entries are added to this cache at a high 
frequency, threads invoking ScriptDependencyResolver.resolveScript() have a 
high chance of being blocked.
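
To illustrate the pattern described above, here is a simplified sketch. It is 
not the actual ScriptDependencyResolver code; the class name LockedScriptCache, 
the String-based keys and values, and the resolver function are hypothetical. 
The fair lock is an assumption based on the FairSync frame in the stacktrace.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical illustration only, not the Sling implementation.
final class LockedScriptCache {
    private final Map<String, String> cache = new HashMap<>();
    // Fair mode (assumed from the FairSync frame in the stacktrace).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final Function<String, String> resolver;

    LockedScriptCache(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    String resolve(String key) {
        // Every caller takes the read lock first, even if the entry is cached.
        // While another thread holds the write lock, this call parks here.
        lock.readLock().lock();
        try {
            String hit = cache.get(key);
            if (hit != null) {
                return hit;
            }
        } finally {
            lock.readLock().unlock();
        }
        // Cache miss: take the write lock, which blocks all readers meanwhile.
        lock.writeLock().lock();
        try {
            return cache.computeIfAbsent(key, resolver);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this sketch, a burst of cache misses (for example during startup) means the 
write lock is taken often, and every concurrent lookup has to wait for it, 
including lookups for keys that are already cached.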

Possible mitigations:
* Disable the caching by setting the ScriptResolutionCacheSize in the HTL 
Engine to a value less than 1024; this can be used as workaround.
* Refactor the code so that cache hits can be served without acquiring the 
read lock.
* Refactor the code to use a ConcurrentHashMap (as [~cziegeler] already 
suggested in the context of SLING-12344, 
[Link|https://github.com/apache/sling-org-apache-sling-scripting-sightly/pull/26#issuecomment-2209407602])
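
A minimal sketch of what the ConcurrentHashMap variant could look like, under 
stated assumptions: the class name LockFreeScriptCache, the String-based keys 
and values, and the resolver function are hypothetical and not taken from the 
Sling code base. The sketch also leaves out the size limit that the real cache 
enforces via ScriptResolutionCacheSize.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical illustration only, not a proposed patch.
final class LockFreeScriptCache {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> resolver;

    LockFreeScriptCache(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    String resolve(String key) {
        // Cache hit: ConcurrentHashMap.get() does not block on writers.
        String hit = cache.get(key);
        if (hit != null) {
            return hit;
        }
        // Cache miss: only threads touching the same bin contend here.
        // The resolver must not modify this map while it is being computed.
        return cache.computeIfAbsent(key, resolver);
    }

    // Drops all entries, e.g. when scripts change.
    void invalidate() {
        cache.clear();
    }
}
```

With this shape, adding entries at a high rate would no longer stall lookups 
for keys that are already cached. A bounded variant would need an eviction 
strategy on top of this.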


Note: SLING-12471 is unrelated to this specific problem!




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
