[ 
https://issues.apache.org/jira/browse/KARAF-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619610#comment-13619610
 ] 

Amichai Rothman commented on KARAF-2256:
----------------------------------------

Great point. I've switched to Equinox, and in the fiddling I've done since 
then I couldn't recreate the deadlock. Although this isn't conclusive proof (it 
rarely is with race conditions and deadlocks), Felix is now looking like the 
main suspect.

> Deadlock when refreshing bundles
> --------------------------------
>
>                 Key: KARAF-2256
>                 URL: https://issues.apache.org/jira/browse/KARAF-2256
>             Project: Karaf
>          Issue Type: Bug
>    Affects Versions: 2.3.1
>         Environment: 64-bit Linux Oracle JDK 1.7.0_17
>            Reporter: Amichai Rothman
>            Assignee: Achim Nierbeck
>
> When attempting to install the DOSGi feature (by running "features:chooseurl 
> cxf-dosgi 1.4.0" and "features:install cxf-dosgi-discovery-distributed"), the 
> installation hangs, along with some of the bundles, which can no longer be 
> started, stopped, checked for imports, etc. The Karaf server must be killed 
> and restarted to recover. This is likely not related to this specific feature, 
> and can happen with other refreshed bundles and installed features as well.
> At a glance it seems like this is caused by the "OPS4J Pax Web - Runtime 
> (1.1.12)" bundle being stuck in the stopping state due to a deadlock caused 
> by its Activator:
> It receives a removedService notification from a service tracker, which is 
> handled in a separate thread using a custom executor. That task eventually 
> tries to resolve some bundle and ends up waiting in acquireGlobalLock 
> indefinitely.
> This is because at the same time, Felix calls refreshPackages, which attempts 
> to stop the bundle (while holding the lock). The bundle's activator puts a 
> cleanup task in its custom executor and then attempts to shut down the 
> executor. The shutdown never completes, because the earlier executor task 
> initiated from removedService is still waiting for the lock, hence the 
> deadlock.
> I'm not entirely sure which of the projects has the underlying bug in it - 
> probably Pax Web, possibly Felix if the OSGi specs allow for the behavior 
> that hangs it. In any case, Karaf is using these versions and exhibiting 
> the deadlock, so at the very least it should upgrade to fixed versions of 
> these libraries, or patch them.
> If anyone who knows these systems better thinks it should be reported in one 
> of the upstream projects, point me in the right direction and I'll be happy 
> to do it.
> Here is the thread dump. The top two threads show the deadlock, and the other 
> two are threads which are stuck as well due to waiting for the same lock (I 
> think).
> "FelixFrameworkWiring" daemon prio=10 tid=0x00007f390002e000 nid=0x35d1 in Object.wait() [0x00007f3948dd3000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000f1b8ea50> (a java.lang.Object)
>         at java.lang.Object.wait(Object.java:503)
>         at org.ops4j.pax.web.service.internal.Executor.shutdown(Executor.java:91)
>         - locked <0x00000000f1b8ea50> (a java.lang.Object)
>         at org.ops4j.pax.web.service.internal.Activator.stop(Activator.java:140)
>         at org.apache.felix.framework.util.SecureAction.stopActivator(SecureAction.java:667)
>         at org.apache.felix.framework.Felix.stopBundle(Felix.java:2361)
>         at org.apache.felix.framework.Felix$RefreshHelper.stop(Felix.java:4629)
>         at org.apache.felix.framework.Felix.refreshPackages(Felix.java:3951)
>         at org.apache.felix.framework.FrameworkWiringImpl.run(FrameworkWiringImpl.java:172)
>         at java.lang.Thread.run(Thread.java:722)
> "Pax Web Runtime worker" daemon prio=10 tid=0x00007f3904263000 nid=0x370a in Object.wait() [0x00007f390dfa8000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:4944)
>         - locked <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.StatefulResolver.resolve(StatefulResolver.java:219)
>         at org.apache.felix.framework.BundleWiringImpl.searchDynamicImports(BundleWiringImpl.java:1539)
>         at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1439)
>         at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:72)
>         at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1843)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>         at org.apache.felix.framework.BundleWiringImpl.getClassByDelegation(BundleWiringImpl.java:1317)
>         at org.apache.felix.framework.ServiceRegistrationImpl$ServiceReferenceImpl.isAssignableTo(ServiceRegistrationImpl.java:521)
>         at org.apache.felix.framework.util.Util.isServiceAssignable(Util.java:280)
>         at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:916)
>         at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>         at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>         at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
>         at org.apache.felix.framework.Felix.access$000(Felix.java:74)
>         at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:390)
>         at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:148)
>         at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:127)
>         at org.ops4j.pax.web.service.internal.Activator.updateController(Activator.java:231)
>         at org.ops4j.pax.web.service.internal.Activator$DynamicsServiceTrackerCustomizer$2.run(Activator.java:387)
>         at org.ops4j.pax.web.service.internal.Executor$Future.run(Executor.java:45)
>         at org.ops4j.pax.web.service.internal.Executor$Worker.run(Executor.java:122)
> "fileinstall-/opt/apache-karaf-2.3.1/deploy" daemon prio=10 tid=0x00007f3904018800 nid=0x35a8 in Object.wait() [0x00007f394aba8000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4871)
>         - locked <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.Felix.startBundle(Felix.java:1744)
>         at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:944)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundle(DirectoryWatcher.java:1247)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundles(DirectoryWatcher.java:1219)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.startAllBundles(DirectoryWatcher.java:1208)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:503)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:291)
> "NioProcessor-2" prio=10 tid=0x00007f3914014000 nid=0x35fd in Object.wait() [0x00007f394a064000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4871)
>         - locked <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.Felix.startBundle(Felix.java:1744)
>         at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:944)
>         at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:931)
>         at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeatures(FeaturesServiceImpl.java:479)
>         at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeature(FeaturesServiceImpl.java:396)
>         at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeature(FeaturesServiceImpl.java:392)
>         at org.apache.karaf.features.command.InstallFeatureCommand.doExecute(InstallFeatureCommand.java:62)
>         at org.apache.karaf.features.command.FeaturesCommandSupport.doExecute(FeaturesCommandSupport.java:41)
>         at org.apache.karaf.shell.console.OsgiCommandSupport.execute(OsgiCommandSupport.java:38)
>         at org.apache.felix.gogo.commands.basic.AbstractCommand.execute(AbstractCommand.java:35)
>         at org.apache.felix.gogo.runtime.CommandProxy.execute(CommandProxy.java:78)
>         at org.apache.felix.gogo.runtime.Closure.executeCmd(Closure.java:474)
>         at org.apache.felix.gogo.runtime.Closure.executeStatement(Closure.java:400)
>         at org.apache.felix.gogo.runtime.Pipe.run(Pipe.java:108)
>         at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:183)
>         at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:120)
>         at org.apache.felix.gogo.runtime.CommandSessionImpl.execute(CommandSessionImpl.java:89)
>         at org.apache.karaf.shell.ssh.ShellCommandFactory$ShellCommand$1.run(ShellCommandFactory.java:109)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.karaf.shell.ssh.ShellCommandFactory$ShellCommand.start(ShellCommandFactory.java:107)
>         at org.apache.sshd.server.channel.ChannelSession.handleExec(ChannelSession.java:388)
>         at org.apache.sshd.server.channel.ChannelSession.handleRequest(ChannelSession.java:235)
>         at org.apache.sshd.server.channel.ChannelSession.handleRequest(ChannelSession.java:195)
>         at org.apache.sshd.common.session.AbstractSession.channelRequest(AbstractSession.java:1057)
>         at org.apache.sshd.server.session.ServerSession.running(ServerSession.java:229)
>         at org.apache.sshd.server.session.ServerSession.handleMessage(ServerSession.java:205)
>         at org.apache.sshd.common.session.AbstractSession.decode(AbstractSession.java:566)
>         at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:236)
>         - locked <0x00000000efd56b00> (a java.lang.Object)
>         at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:58)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:690)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
>         at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:109)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:410)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:710)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
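
The lock cycle described in the issue (a thread that holds the global lock 
waits for an executor to drain, while a task queued in that executor waits for 
the same lock) can be illustrated with a minimal, self-contained sketch. This 
is not the Felix or Pax Web code: the class name RefreshDeadlockSketch, the 
method stopWhileHoldingLock, and the use of ReentrantLock and 
java.util.concurrent executors are all assumptions made for illustration. The 
timeout is also an addition, so that the sketch terminates; according to the 
thread dump, the real shutdown waits indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class RefreshDeadlockSketch {
    // Hypothetical stand-in for the framework's global lock.
    static final ReentrantLock globalLock = new ReentrantLock();

    /**
     * Mimics a bundle stop that runs while the global lock is held.
     * Returns true if the executor drained within the timeout, false if
     * the queued task was still blocked on the lock (the cycle from the issue).
     */
    static boolean stopWhileHoldingLock(long timeoutMillis) throws InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CountDownLatch lockHeld = new CountDownLatch(1);

        // Task 1: plays the role of the removedService handler, which ends up
        // needing the global lock (e.g. to resolve a bundle).
        worker.submit(() -> {
            try {
                lockHeld.await();               // start only once the "refresh" thread owns the lock
                globalLock.lockInterruptibly(); // blocks: the lock owner is waiting for us
                try {
                    // resolving would happen here
                } finally {
                    globalLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        globalLock.lock();                      // "refreshPackages" takes the global lock
        try {
            lockHeld.countDown();
            worker.submit(() -> { /* cleanup task queued by the activator's stop */ });
            worker.shutdown();
            // An unbounded wait here would hang forever; the bound is what
            // lets this sketch report the cycle instead of reproducing the hang.
            boolean drained = worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!drained) {
                worker.shutdownNow();           // interrupt the blocked task and give up
            }
            return drained;
        } finally {
            globalLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("executor drained: " + stopWhileHoldingLock(500));
    }
}
```

In this sketch the call returns false, because the first task cannot acquire 
the lock until the stopping thread releases it, and the stopping thread does 
not release it until the executor has drained. Bounding the wait (or not 
waiting for the executor at all while a framework lock is held) is one 
possible way to break such a cycle; whether that is the right fix for either 
project is not something this sketch can settle.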

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
