[ 
https://issues.apache.org/jira/browse/FELIX-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard S. Hall resolved FELIX-961.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: felix-1.6.0

I think TESTCASE2 was interesting, since it demonstrated that it is fairly easy 
to end up working through "uses" constraint violations for a very long time 
without a really complicated use case. In this particular case, the issue arose because bundles 
were overriding packages provided by the system bundle. This resulted in lots 
of imports that had two candidates, one from the system bundle and one from 
another bundle.

This ended up being a worst-case scenario because the system bundle export was 
chosen first: it was already resolved, and resolved packages are given 
preference. I noticed that some bundles eliminated the system bundle by 
including a version range on their imports, but others did not. If all bundles 
imported with an appropriate version range, then this issue would not have 
appeared, since there wouldn't have been so many imports with multiple 
candidates.

As a result, I implemented two "fixes" for this:

1. I noticed that some runs didn't take long, depending on the order in which 
the bundles were processed when calculating uses constraints; that order varied 
because the bundles were pulled out of a map. If bundles with lots of potential 
candidates for their imports were handled first, resolution seemed to be 
quicker. Thus, I modified the resolver to first sort the bundles based on the 
number of potential candidates they had. This change got the resolver 
completing consistently in about 15 seconds with testcase2.

2. The algorithm for permuting from one set of potential candidates to 
another when a constraint violation is detected is exhaustive, but not very 
smart. In an effort to make it a little smarter, I thought of a way to make it 
permute the candidates for a specific bundle when a constraint violation is 
detected without losing the ability to be exhaustive; now the resolver rotates 
the potential candidates for the bundle where the constraint violation was 
detected and retests. This change got the resolver completing consistently in 
about 1 second with testcase2.
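
To illustrate the two changes above, here is a minimal sketch in Java. It is 
not the actual Felix resolver code: the class name ResolverSketch, the methods 
orderBundles and rotateUntilConsistent, and the map-of-strings model of bundles 
and candidates are all hypothetical, and the consistency check is passed in as 
a stand-in for the real "uses" constraint test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ResolverSketch {
    // Hypothetical model: for one bundle, each imported package name maps to
    // the names of the bundles that could provide it.
    static int candidateCount(Map<String, List<String>> imports) {
        int total = 0;
        for (List<String> candidates : imports.values()) {
            total += candidates.size();
        }
        return total;
    }

    // Fix 1 (sketch): process bundles with the most potential candidates
    // first. Ties are broken by name so the order no longer depends on map
    // iteration order.
    static List<String> orderBundles(Map<String, Map<String, List<String>>> bundles) {
        List<String> order = new ArrayList<>(bundles.keySet());
        order.sort((a, b) -> {
            int byCount = Integer.compare(
                candidateCount(bundles.get(b)), candidateCount(bundles.get(a)));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return order;
    }

    // Fix 2 (sketch): when the preferred candidate violates a constraint,
    // rotate the candidate list for that one import and retest. Every
    // candidate is tried at most once, so the search stays exhaustive.
    // Returns the first consistent candidate, or null if there is none.
    static String rotateUntilConsistent(List<String> candidates,
                                        Predicate<String> isConsistent) {
        List<String> working = new ArrayList<>(candidates);
        for (int attempt = 0; attempt < working.size(); attempt++) {
            String preferred = working.get(0);
            if (isConsistent.test(preferred)) {
                return preferred;
            }
            // Move the rejected candidate to the end of the list.
            Collections.rotate(working, -1);
        }
        return null;
    }
}
```

In this simplified model, a bundle whose import has the candidates "system" 
and "xml-apis" would first test "system" and, on a violation, rotate to 
"xml-apis" and retest, rather than restarting a global permutation.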

It is not clear if these fixes are general or specific to testcase2, but my 
intuition says they should provide general improvements. However, they have not 
fixed the worst-case situation, and it is still possible that some set of 
bundles could cause the resolver to go on for a very long time trying to find a 
solution. The only solution here is to fail after a certain amount of time or 
to completely rewrite the resolver.
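
As a rough illustration of the "fail after a certain amount of time" option, 
the sketch below bounds an exhaustive search with a time budget. Nothing here 
exists in Felix: BoundedSearch, firstConsistent, ResolveTimeoutException, and 
the budgetNanos parameter are hypothetical names, and the iterator stands in 
for whatever produces the next candidate permutation.

```java
import java.util.Iterator;
import java.util.function.Predicate;

public class BoundedSearch {
    // Hypothetical exception signalling that the time budget ran out.
    public static class ResolveTimeoutException extends RuntimeException {
        public ResolveTimeoutException(String message) {
            super(message);
        }
    }

    // Tests permutations in order until one is consistent. Returns that
    // permutation, or null if all were tried and none was consistent.
    // Throws ResolveTimeoutException once the elapsed time reaches the budget.
    static <T> T firstConsistent(Iterator<T> permutations,
                                 Predicate<T> isConsistent,
                                 long budgetNanos) {
        long start = System.nanoTime();
        while (permutations.hasNext()) {
            if (System.nanoTime() - start >= budgetNanos) {
                throw new ResolveTimeoutException(
                    "gave up after " + budgetNanos + " ns");
            }
            T permutation = permutations.next();
            if (isConsistent.test(permutation)) {
                return permutation;
            }
        }
        return null;
    }
}
```

The trade-off with this approach is that a timeout cannot distinguish "no 
solution exists" from "a solution exists but was not reached in time".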

> 100% CPU looping inside uses calculation
> ----------------------------------------
>
>                 Key: FELIX-961
>                 URL: https://issues.apache.org/jira/browse/FELIX-961
>             Project: Felix
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: felix-1.4.1
>            Reporter: Stuart McCulloch
>            Assignee: Richard S. Hall
>             Fix For: felix-1.6.0
>
>         Attachments: USES_TESTCASE.zip, USES_TESTCASE2.zip
>
>
> While investigating a problem report against pax-runner 
> (http://article.gmane.org/gmane.comp.java.ops4j.general/6778) I found it was 
> actually caused by a 100% CPU loop inside the "uses" calculation code. In 
> Felix 1.4.1 this was stopping the shell bundle from activating, hence the 
> lack of console. Using the trunk build I can get a console, but the looping 
> still occurs with the testcase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
