[
https://issues.apache.org/jira/browse/FELIX-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Richard S. Hall resolved FELIX-961.
-----------------------------------
Resolution: Fixed
Fix Version/s: felix-1.6.0
I think TESTCASE2 was interesting, since it demonstrated that it is fairly easy
to get into a long-running search caused by "uses" constraint violations
without a really complicated use case. In this particular case, the issue arose
because bundles were overriding packages provided by the system bundle. This
resulted in lots of imports that had two candidates: one from the system bundle
and one from another bundle.
This ended up being a worst-case scenario because the system bundle export was
chosen first: it was already resolved, and resolved packages are given
preference. I noticed that some bundles eliminated the system bundle as a
candidate by including a version range on their imports, but others did not.
If all bundles had imported with an appropriate version range, this issue would
not have appeared, since there wouldn't have been so many imports with multiple
candidates.
As a result, I implemented two "fixes" for this:
1. I noticed that some runs didn't take long, depending on the order in which
bundles were processed when calculating uses constraints; that order varied
because the bundles were pulled out of a map. If bundles with lots of potential
candidates for their imports
were handled first, it seemed to be quicker. Thus, I modified the resolver to
first sort the bundles based on the number of potential candidates they had. This
change got the resolver completing consistently in about 15 seconds with
testcase2.
2. The algorithm for permuting from one set of potential candidates to
another when a constraint violation is detected is exhaustive, but not very
smart. In an effort to make it a little smarter, I thought of a way to make it
permute the candidates for a specific bundle when a constraint violation is
detected without losing the ability to be exhaustive; now the resolver rotates
the potential candidates for the bundle where the constraint violation was
detected and retests. This change got the resolver completing consistently in
about 1 second with testcase2.
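As a rough illustration of the two changes above (this is not the actual Felix
resolver code), the following Java sketch orders bundles so that those with the
most potential candidates come first, and rotates a candidate list after a
violation. The class CandidateOrdering, its method names, and the simplified
model of a single candidate list per bundle are all assumptions made for this
example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names and data model are illustrative only.
public class CandidateOrdering {

    // Returns bundle names ordered by number of potential candidates,
    // largest first. The sort is stable, so ties keep the map's order.
    static List<String> sortByCandidateCount(Map<String, List<String>> candidates) {
        List<String> bundles = new ArrayList<>(candidates.keySet());
        bundles.sort(Comparator
                .comparingInt((String b) -> candidates.get(b).size())
                .reversed());
        return bundles;
    }

    // Moves the current first-choice candidate to the end of the list so the
    // next candidate is tried on the retest. No candidate is dropped.
    static void rotateCandidates(List<String> candidateList) {
        if (candidateList.size() > 1) {
            Collections.rotate(candidateList, -1);
        }
    }

    public static void main(String[] args) {
        // Assumed example data: bundle name -> exporters that could satisfy it.
        Map<String, List<String>> candidates = new LinkedHashMap<>();
        candidates.put("a", new ArrayList<>(List.of("system")));
        candidates.put("b", new ArrayList<>(List.of("system", "x", "y")));
        candidates.put("c", new ArrayList<>(List.of("system", "x")));

        System.out.println(sortByCandidateCount(candidates));

        // Pretend a uses constraint violation was detected for bundle "b".
        rotateCandidates(candidates.get("b"));
        System.out.println(candidates.get("b"));
    }
}
```

Rotating rather than discarding keeps every candidate in the list, so a search
built on top of this can still reach each of them eventually.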
It is not clear if these fixes are general or specific to testcase2, but my
intuition says they should provide general improvements. However, they have not
fixed the worst-case situation and it is still possible some set of bundles
could cause the resolver to go on for a very long time trying to find a
solution. The only solutions here are to fail after a certain amount of time or
to completely rewrite the resolver.
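The "fail after a certain amount of time" option could look roughly like the
Java sketch below. Nothing like this exists in the resolver as described here;
the class BoundedSearch, the method findFirst, and the choice of
IllegalStateException are assumptions made purely for illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of a time-bounded exhaustive search.
public class BoundedSearch {

    // Tries each permutation in turn and returns the first consistent one.
    // Gives up with an exception once the time budget is exhausted.
    static <T> Optional<T> findFirst(Iterator<T> permutations,
                                     Predicate<T> isConsistent,
                                     long timeoutNanos) {
        long deadline = System.nanoTime() + timeoutNanos;
        while (permutations.hasNext()) {
            // Compare via subtraction, as recommended for System.nanoTime().
            if (System.nanoTime() - deadline > 0) {
                throw new IllegalStateException("resolve timed out");
            }
            T candidate = permutations.next();
            if (isConsistent.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // search space exhausted, no solution
    }

    public static void main(String[] args) {
        // Assumed toy search space: integers standing in for permutations.
        Optional<Integer> hit =
                findFirst(List.of(1, 2, 3).iterator(), i -> i == 2, 1_000_000_000L);
        System.out.println(hit);
    }
}
```

The budget is checked once per permutation, so a single very slow consistency
check can still overrun it.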
> 100% CPU looping inside uses calculation
> ----------------------------------------
>
> Key: FELIX-961
> URL: https://issues.apache.org/jira/browse/FELIX-961
> Project: Felix
> Issue Type: Bug
> Components: Framework
> Affects Versions: felix-1.4.1
> Reporter: Stuart McCulloch
> Assignee: Richard S. Hall
> Fix For: felix-1.6.0
>
> Attachments: USES_TESTCASE.zip, USES_TESTCASE2.zip
>
>
> While investigating a problem report against pax-runner
> (http://article.gmane.org/gmane.comp.java.ops4j.general/6778) I found it was
> actually caused by a 100% CPU loop inside the "uses" calculation code. In
> Felix 1.4.1 this was stopping the shell bundle from activating, hence the
> lack of console. Using the trunk build I can get a console, but the looping
> still occurs with the testcase.