[ 
https://issues.apache.org/jira/browse/FELIX-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997543#comment-12997543
 ] 

Richard S. Hall commented on FELIX-2400:
----------------------------------------

Did you ever get around to testing this on a 3.0.x release of the framework?

> High contention (or deadlock) in PackageAdmin and StartLevel 
> -------------------------------------------------------------
>
>                 Key: FELIX-2400
>                 URL: https://issues.apache.org/jira/browse/FELIX-2400
>             Project: Felix
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: framework-2.0.5
>         Environment: Felix 2.0.5
> java version "1.6.0_12"
> Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)
> SunOS castor 5.10 Generic_138888-06 sun4u sparc SUNW,Sun-Fire-V890
>            Reporter: Alexander Berger
>
> Imagine the following code:
> void createProblem(PackageAdmin pa, StartLevel sl, Bundle bundles[], int level){
>    for ( final Bundle b : bundles) {
>       sl.setBundleStartLevel(b, level);
>    }
>    pa.refreshPackages(null);
>    pa.resolveBundles(null);
> }
> If many bundles have been updated or uninstalled, the code above can
> produce what looks like a deadlock (see the stack traces below) but is in
> fact a high-contention problem. On our system (16-core Sun Sparcv9, 64 GB)
> with about 20 bundles (all updated, so the refresh is busy), runtime
> performance is very poor: pa.resolveBundles(null) takes about 30 to
> 60 minutes to return.
> The problem lies in the asynchronous nature of
> setBundleStartLevel/refreshPackages and the way Felix uses locking
> (acquireGlobalLock and acquireBundleLock). For example, the following code
> works fine (pa.resolveBundles(null) returns within a few seconds) but
> raises the question of how to implement "magicWait":
> void createNoProblem(PackageAdmin pa, StartLevel sl, Bundle bundles[], int level){
>    for ( final Bundle b : bundles) {
>       sl.setBundleStartLevel(b, level);
>    }
>    // wait until the asynchronous sl.setBundleStartLevel logic has finished
>    magicWait(sl);
>    pa.refreshPackages(null);
>    // wait until the asynchronous pa.refreshPackages logic has finished
>    magicWait(pa); 
>    pa.resolveBundles(null);
> }
> At the moment I have solved the problem by patching PackageAdminImpl like
> this (I know this is an ugly solution, but it's only a showcase):
> public boolean isDone() {
>    synchronized(this) {
>       final Bundle tmp[][] = m_reqBundles;
>       return tmp == null || tmp.length == 0;
>    }
> }
> And implementing magicWait like this:
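> void magicWait(final PackageAdmin pa) throws Exception {
>     // isDone() is the method added by the PackageAdminImpl patch above;
>     // the reflective calls throw checked exceptions, hence "throws Exception"
>     final Method method = pa.getClass().getMethod("isDone");
>     method.setAccessible(true);
>     // busy-wait until the asynchronous refresh has finished
>     while ( ! (Boolean)method.invoke(pa) ) {
>        Thread.yield();
>     }
> }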
> Then I did something similar for StartLevel.
> For me this patch/workaround is fine for the moment, but I think the problem
> should be investigated and solved in the Felix framework.
> "FelixPackageAdmin" daemon prio=3 tid=0x00000001005ac800 nid=0x1a in Object.wait() [0xffffffff4f6fe000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0xffffffff554000e0> (a [Ljava.lang.Object;)
>       at java.lang.Object.wait(Object.java:485)
>       at org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:4535)
>       - locked <0xffffffff554000e0> (a [Ljava.lang.Object;)
>       at org.apache.felix.framework.Felix.refreshPackages(Felix.java:3314)
>       at org.apache.felix.framework.PackageAdminImpl.run(PackageAdminImpl.java:331)
>       at java.lang.Thread.run(Unknown Source)
>    Locked ownable synchronizers:
>       - None
>       
> "FelixStartLevel" daemon prio=3 tid=0x0000000100848000 nid=0x19 in Object.wait() [0xffffffff4f8fe000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0xffffffff554000e0> (a [Ljava.lang.Object;)
>       at java.lang.Object.wait(Object.java:485)
>       at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4462)
>       - locked <0xffffffff554000e0> (a [Ljava.lang.Object;)
>       at org.apache.felix.framework.Felix.setBundleStartLevel(Felix.java:1266)
>       at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:270)
>       at java.lang.Thread.run(Unknown Source)
>    Locked ownable synchronizers:
>       - None
>       
> "OSKi" prio=3 tid=0x00000001006ea800 nid=0x1b runnable [0xffffffff4f4fd000]
>    java.lang.Thread.State: RUNNABLE
>       at org.apache.felix.framework.searchpolicy.ResolvedPackage.clone(ResolvedPackage.java:62)
>       at org.apache.felix.framework.searchpolicy.Resolver.isClassSpaceConsistent(Resolver.java:846)
>       at org.apache.felix.framework.searchpolicy.Resolver.isClassSpaceConsistent(Resolver.java:807)
>       at org.apache.felix.framework.searchpolicy.Resolver.isClassSpaceConsistent(Resolver.java:807)
>       at org.apache.felix.framework.searchpolicy.Resolver.findConsistentClassSpace(Resolver.java:549)
>       at org.apache.felix.framework.searchpolicy.Resolver.resolve(Resolver.java:103)
>       at org.apache.felix.framework.Felix$FelixResolver.resolve(Felix.java:3861)
>       at org.apache.felix.framework.Felix.resolveBundle(Felix.java:3292)
>       at org.apache.felix.framework.Felix.resolveBundles(Felix.java:3267)
>       at org.apache.felix.framework.PackageAdminImpl.resolveBundles(PackageAdminImpl.java:288)
>       at Test.createProblem(Test.java:10)
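
The magicWait loop quoted above spins on Thread.yield() with no upper bound, so a refresh that never completes would hang the caller. A minimal sketch of a bounded variant follows. It is not part of Felix or of the reporter's patch: the class name BoundedWait, its await method and all of its parameters are hypothetical, and it assumes Java 8 or later for BooleanSupplier. For refreshPackages specifically, the OSGi framework also fires FrameworkEvent.PACKAGES_REFRESHED on completion, which a FrameworkListener could observe instead of polling; that route is not shown here.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition with a sleep interval and a deadline. */
public final class BoundedWait {
    private BoundedWait() {}

    /**
     * Blocks until isDone returns true, sleeping pollMillis between checks.
     * Throws TimeoutException if the condition is still false once the
     * timeout has elapsed.
     */
    public static void await(BooleanSupplier isDone, long timeout, TimeUnit unit,
                             long pollMillis)
            throws InterruptedException, TimeoutException {
        // nanoTime is monotonic, so the deadline is immune to wall-clock changes
        final long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!isDone.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(
                    "condition not met within " + timeout + " " + unit);
            }
            // sleep instead of Thread.yield() so the waiter does not burn a core
            Thread.sleep(pollMillis);
        }
    }
}
```

With the patched isDone() from the issue description, a caller might pass a lambda that wraps the reflective call, a timeout of a minute or so, and a poll interval of a few tens of milliseconds; those values are guesses and would need tuning.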

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
