[ https://issues.apache.org/jira/browse/SLING-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carsten Ziegeler reassigned SLING-2719:
---------------------------------------

    Assignee: Carsten Ziegeler

> Deadlock in ResourceResolverFactoryActivator.checkFactoryPreconditions
> ----------------------------------------------------------------------
>
>                 Key: SLING-2719
>                 URL: https://issues.apache.org/jira/browse/SLING-2719
>             Project: Sling
>          Issue Type: Bug
>          Components: ResourceResolver
>    Affects Versions: Resource Resolver 1.0.2
>         Environment: JBoss
>            Reporter: Chetan Mehrotra
>            Assignee: Carsten Ziegeler
>              Labels: deadlock
>         Attachments: error-log-threaddump.zip
>
>
> We are seeing intermittent deadlocks while running a Sling-based webapp in
> an app server like JBoss. The deadlock occurs between the
> FelixFrameworkWiring and FelixStartLevel threads.
>
> For example, consider the order of locks taken in threaddump-1.log (shown
> below). The FelixFrameworkWiring thread holds the global bundle lock at
> Felix level [1] and is waiting for the lock in
> ResourceResolverFactoryActivator.checkFactoryPreconditions, while the
> FelixStartLevel thread holds the lock on the RRF and is waiting for the
> global lock. The result is a deadlock.
>
> The FelixFrameworkWiring thread [5] is busy deactivating components because
> of an earlier package refresh (which led to the repository being shut down
> and thus triggered deactivation of ResourceResolverFactoryActivator). The
> FelixStartLevel thread [6] has activated ResourceResolverFactoryActivator
> (and thus holds its lock) and later requires the global lock for some
> operation.
>
> Looking at the code for
> ResourceResolverFactoryActivator.checkFactoryPreconditions [2], it appears
> to take and hold a lock (on this) while making a call to the OSGi
> container. Such usage *might* cause issues like this deadlock. So it would
> be better if ResourceResolverFactoryActivator did not hold any lock while
> making calls to container services [3].
>
> "FelixFrameworkWiring"
> - locked <0x00000007944da478> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.unregisterComponentService(AbstractComponentManager.java:702)
> - locked <0x00000007944da9b0> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.unregisterComponentService(AbstractComponentManager.java:702)
> - locked <0x00000007944dae38> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.unregisterComponentService(AbstractComponentManager.java:702)
> - locked <0x0000000796d5d030> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.unregisterComponentService(AbstractComponentManager.java:702)
> - waiting to lock <0x000000079624ff08> (a org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator)
>   org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.checkFactoryPreconditions(ResourceResolverFactoryActivator.java:330)
>
> "FelixStartLevel"
> - locked <0x000000079624ff08> (a org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator)
>   org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.checkFactoryPreconditions(ResourceResolverFactoryActivator.java:324)
> - locked <0x0000000796959bc8> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.registerService(AbstractComponentManager.java:660)
> - locked <0x0000000796959eb8> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.registerService(AbstractComponentManager.java:660)
> - locked <0x000000079695a188> (a java.util.concurrent.atomic.AtomicReference)
>   org.apache.felix.scr.impl.manager.AbstractComponentManager.registerService(AbstractComponentManager.java:660)
> - waiting <0x000000079415eca0> (a [Ljava.lang.Object;)
>   org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:5019)
>
> [1] This has been confirmed via the value of m_globalLockThread of the
> Felix instance in the heap dump
> [2] https://github.com/apache/sling/blob/trunk/bundles/resourceresolver/src/main/java/org/apache/sling/resourceresolver/impl/ResourceResolverFactoryActivator.java#L313
> [3] http://njbartlett.name/files/osgibook_preview_20091217.pdf (Section 6.4
> Don’t Hold Locks when Calling Foreign Code)
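
The suggestion in the report (do not hold a lock while calling container services) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Sling code or the eventual fix: PreconditionChecker, Container and Action are hypothetical names, and Container merely stands in for whatever OSGi calls the real activator makes. The idea is to decide what to do while holding the lock, then make the foreign call after releasing it.

```java
public class PreconditionChecker {

    /** Hypothetical stand-in for container calls (e.g. registering a service). */
    public interface Container {
        void register();
        void unregister();
    }

    /** The decision taken under the lock, carried out after the lock is released. */
    private enum Action { NONE, REGISTER, UNREGISTER }

    private final Object lock = new Object();
    private final Container container;
    private boolean preconditionsMet; // guarded by lock
    private boolean registered;       // guarded by lock

    public PreconditionChecker(Container container) {
        this.container = container;
    }

    public void setPreconditionsMet(boolean met) {
        synchronized (lock) {
            preconditionsMet = met;
        }
        checkPreconditions();
    }

    public void checkPreconditions() {
        final Action action;
        // Only inspect and update our own state while holding the lock.
        synchronized (lock) {
            if (preconditionsMet && !registered) {
                registered = true;
                action = Action.REGISTER;
            } else if (!preconditionsMet && registered) {
                registered = false;
                action = Action.UNREGISTER;
            } else {
                action = Action.NONE;
            }
        }
        // Foreign code runs here with no lock of ours held, so it is free
        // to acquire framework-level locks without waiting on this object.
        switch (action) {
            case REGISTER:
                container.register();
                break;
            case UNREGISTER:
                container.unregister();
                break;
            default:
                break;
        }
    }
}
```

One trade-off of this sketch: because the container calls happen outside the lock, two threads may issue them in a different order than the decisions were taken, so real code would need to tolerate or serialize that in some other way.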