[
https://issues.apache.org/jira/browse/FELIX-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997543#comment-12997543
]
Richard S. Hall commented on FELIX-2400:
----------------------------------------
Did you ever get around to testing this on a 3.0.x release of the framework?
> High contention (or deadlock) in PackageAdmin and StartLevel
> -------------------------------------------------------------
>
> Key: FELIX-2400
> URL: https://issues.apache.org/jira/browse/FELIX-2400
> Project: Felix
> Issue Type: Bug
> Components: Framework
> Affects Versions: framework-2.0.5
> Environment: Felix 2.0.5
> java version "1.6.0_12"
> Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)
> SunOS castor 5.10 Generic_138888-06 sun4u sparc SUNW,Sun-Fire-V890
> Reporter: Alexander Berger
>
> Imagine the following code:
> void createProblem(PackageAdmin pa, StartLevel sl, Bundle bundles[], int level) {
>     for (final Bundle b : bundles) {
>         sl.setBundleStartLevel(b, level);
>     }
>     pa.refreshPackages(null);
>     pa.resolveBundles(null);
> }
> If many bundles have been updated or uninstalled, the code above might
> create what looks like a deadlock (see stack traces below)
> but is in fact a high-contention problem. On our system (16-core Sun SPARCv9,
> 64 GB) with about 20 bundles (all updated, so the refresh will be busy),
> this results in very poor runtime performance: it takes about 30 to
> 60 minutes for pa.resolveBundles(null) to return.
> The problem lies in the asynchronous nature of
> setBundleStartLevel/refreshPackages and the way that Felix uses locking
> (acquireGlobalLock and acquireBundleLock). For example, the following code
> works fine (pa.resolveBundles(null) returns within a few seconds) but
> poses the problem of how to implement "magicWait":
> void createNoProblem(PackageAdmin pa, StartLevel sl, Bundle bundles[], int level) {
>     for (final Bundle b : bundles) {
>         sl.setBundleStartLevel(b, level);
>     }
>     // wait until the asynchronous sl.setBundleStartLevel logic has finished
>     magicWait(sl);
>     pa.refreshPackages(null);
>     // wait until the asynchronous pa.refreshPackages logic has finished
>     magicWait(pa);
>     pa.resolveBundles(null);
> }
> At the moment I have solved the problem by patching PackageAdminImpl like this (I
> know this is an ugly solution, but it's only a showcase):
> public boolean isDone() {
>     synchronized (this) {
>         final Bundle tmp[][] = m_reqBundles;
>         return tmp == null || tmp.length == 0;
>     }
> }
> And implementing magicWait like this:
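> void magicWait(final PackageAdmin pa) throws Exception {
>     // isDone() exists only in the patched PackageAdminImpl, so look it up reflectively
>     final Method method = pa.getClass().getMethod("isDone");
>     method.setAccessible(true);
>     // busy-wait until the pending refresh requests have been processed
>     while (!(Boolean) method.invoke(pa)) {
>         Thread.yield();
>     }
> }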
> Then I did something similar for StartLevel.
> For me this patch/workaround is fine for the moment, but I think the problem
> should be investigated and solved in the Felix framework.
> "FelixPackageAdmin" daemon prio=3 tid=0x00000001005ac800 nid=0x1a in Object.wait() [0xffffffff4f6fe000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xffffffff554000e0> (a [Ljava.lang.Object;)
> at java.lang.Object.wait(Object.java:485)
> at org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:4535)
> - locked <0xffffffff554000e0> (a [Ljava.lang.Object;)
> at org.apache.felix.framework.Felix.refreshPackages(Felix.java:3314)
> at org.apache.felix.framework.PackageAdminImpl.run(PackageAdminImpl.java:331)
> at java.lang.Thread.run(Unknown Source)
> Locked ownable synchronizers:
> - None
>
> "FelixStartLevel" daemon prio=3 tid=0x0000000100848000 nid=0x19 in Object.wait() [0xffffffff4f8fe000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xffffffff554000e0> (a [Ljava.lang.Object;)
> at java.lang.Object.wait(Object.java:485)
> at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4462)
> - locked <0xffffffff554000e0> (a [Ljava.lang.Object;)
> at org.apache.felix.framework.Felix.setBundleStartLevel(Felix.java:1266)
> at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:270)
> at java.lang.Thread.run(Unknown Source)
> Locked ownable synchronizers:
> - None
>
> "OSKi" prio=3 tid=0x00000001006ea800 nid=0x1b runnable [0xffffffff4f4fd000]
> java.lang.Thread.State: RUNNABLE
> at org.apache.felix.framework.searchpolicy.ResolvedPackage.clone(ResolvedPackage.java:62)
> at org.apache.felix.framework.searchpolicy.Resolver.isClassSpaceConsistent(Resolver.java:846)
> at org.apache.felix.framework.searchpolicy.Resolver.isClassSpaceConsistent(Resolver.java:807)
> at org.apache.felix.framework.searchpolicy.Resolver.isClassSpaceConsistent(Resolver.java:807)
> at org.apache.felix.framework.searchpolicy.Resolver.findConsistentClassSpace(Resolver.java:549)
> at org.apache.felix.framework.searchpolicy.Resolver.resolve(Resolver.java:103)
> at org.apache.felix.framework.Felix$FelixResolver.resolve(Felix.java:3861)
> at org.apache.felix.framework.Felix.resolveBundle(Felix.java:3292)
> at org.apache.felix.framework.Felix.resolveBundles(Felix.java:3267)
> at org.apache.felix.framework.PackageAdminImpl.resolveBundles(PackageAdminImpl.java:288)
> at Test.createProblem(Test.java:10)
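
The reflective busy-wait in the quoted magicWait could, in principle, be replaced by blocking on a completion notification. What follows is only a minimal sketch of that idea, not code from Felix: CompletionWaiter and CompletionListener are hypothetical names, and the listener merely stands in for whatever callback reports that the asynchronous work has finished.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/**
 * Sketch: block until an asynchronous operation signals completion,
 * instead of spinning with Thread.yield().
 * All names here are hypothetical and not part of Felix or OSGi.
 */
final class CompletionWaiter {

    /** Hypothetical stand-in for a "work finished" callback. */
    interface CompletionListener {
        void completed();
    }

    // Released exactly once, when the listener is invoked.
    private final CountDownLatch done = new CountDownLatch(1);

    /** Returns the listener to register before starting the asynchronous work. */
    CompletionListener listener() {
        return done::countDown;
    }

    /**
     * Waits at most timeoutMillis for the completion signal.
     * Returns true if the signal arrived in time, false on timeout or interrupt.
     */
    boolean await(long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

For refreshPackages, the OSGi specification says the framework fires a FrameworkEvent of type PACKAGES_REFRESHED when the refresh completes, so a FrameworkListener registered through BundleContext.addFrameworkListener before the call could invoke completed() on that event. The sketch does not cover setBundleStartLevel, which would still need some other way to detect completion.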