Alex Brdar created FELIX-5003:
---------------------------------

             Summary: Eclipse Equinox Region Digraph not being recomputed
                 Key: FELIX-5003
                 URL: https://issues.apache.org/jira/browse/FELIX-5003
             Project: Felix
          Issue Type: Bug
          Components: Bundle Repository (OBR), Framework
    Affects Versions: framework-4.6.1
         Environment: Red Hat Enterprise Linux Server release 6.5 (Santiago)
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)
            Reporter: Alex Brdar
            Priority: Critical
         Attachments: digraphs.tgz

We found this issue while upgrading from JDK 1.7 to JDK 1.8 on Adobe AEM. All 
of the bundle imports were cleaned up for the new JDK, and everything 
works correctly on an Adobe AEM publisher instance. On an author instance, 
however, we get errors after the JVM is restarted, and the bundles cannot be 
deployed again. Our observations follow.

When we first deploy the code, it deploys successfully. If we restart the JVM, 
all of the bundles from our deployed package become undeployed, and further 
attempts to install the package (even after complete deletion and removal of 
all packages, OSGi bundles, CRX directories, etc.) fail. Once it is broken, 
the only way back is to restore from a backup and start again, or to 
replace the digraph file (see below).

We have found a workaround:
- Install the package, which deploys all of the bundles into Felix OSGi
- In the OSGi webconsole (http://52.69.160.21:4503/system/console/vmstat) hit 
Restart and wait for all of the packages to complete loading.
- Now restart the JVM.

Following this procedure, everything works correctly.

I have tracked this down to the Eclipse Equinox Region Digraph, which 
does not seem to resolve the packages correctly. This is probably a symptom 
and not the issue itself, because the OSGi webconsole restart workaround 
(above) seems to recreate the digraph file, after which resolution succeeds.

The attached archive (digraphs.tgz) contains three digraph files:

-    startup-digraph: the digraph that is created on startup of the 
repository.
-    failed-digraph: the digraph that is present when the server has been 
restarted via CLI without the workaround and fails to deploy packages.
-    success-digraph: the digraph that is created after the OSGi webconsole 
restart has been performed and the server has restarted successfully. We can 
use this file to replace the existing digraph file saved on the filesystem, 
and the server will start properly, which is another workaround.
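
The file-replacement workaround could be scripted roughly as follows. This is a minimal sketch only: the class name DigraphSwapSketch and both paths are hypothetical (the actual location of the digraph file is not given here and must be supplied by the caller), and it assumes the instance is stopped while the file is swapped. It keeps a ".bak" copy of the existing digraph before overwriting it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; not part of Felix, Equinox, or AEM.
class DigraphSwapSketch {

    // Copies the current digraph to "<name>.bak" next to it, then overwrites
    // the current digraph with the replacement file. Returns the backup path.
    static Path replaceWithBackup(Path current, Path replacement) throws IOException {
        Path backup = current.resolveSibling(current.getFileName() + ".bak");
        Files.copy(current, backup, StandardCopyOption.REPLACE_EXISTING);
        Files.copy(replacement, current, StandardCopyOption.REPLACE_EXISTING);
        return backup;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DigraphSwapSketch <current-digraph> <success-digraph>");
            return;
        }
        // Both paths are assumptions supplied by the caller.
        Path backup = replaceWithBackup(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Previous digraph kept at " + backup);
    }
}
```

Keeping the backup means the failed digraph stays available for comparison with the working one.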

I think this is where the issue lies because I have traced the calls made 
when packages cannot be resolved, and the resolution removes the correct 
resolved bundle starting from org.apache.felix.framework.StatefulResolver, 
lines 260-264:
try
{
    Felix.m_secureAction
        .invokeResolverHookMatches(hook, (BundleRequirement)req, shrinkable);
}

This in turn eventually invokes 
org.eclipse.equinox.internal.region.hook.RegionResolverHook.filterMatches, 
which removes the correct resolved bundle in this section of code (in the 
filterCandidates method, lines 55-61):
if (requirerRegion == null) {
    // for singleton check; keep all collisions
    if (!singleton) {
        candidates.clear();
    }
    return;
}

The candidates are cleared because null is returned here (lines 110-112):
private Region getRegion(Bundle bundle) {
    return this.regionDigraph.getRegion(bundle);
}

The bundle being resolved enters the RegionResolverHook with this 
bundle/resolution information:
Requirer: com.springsource.com.mysql.jdbc_5.1.6[447]
Candidates:
Package javax.naming from provider org.apache.felix.framework_4.6.1.B001[0]

The OSGi failure message indicates that javax.naming couldn't be found, but the 
actual issue is that the digraph cannot resolve any Java packages in the 
deployed bundles once it gets into this state.
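
The failure path described above can be reproduced in isolation with a small sketch. It does not use any Felix or Equinox classes: RegionFilterSketch, its map-based digraph, the region name, and the string candidates are hypothetical stand-ins, and only the null check mirrors the quoted filterCandidates code. It shows that a requirer missing from the digraph loses every candidate, which would then surface as an unresolved import such as javax.naming.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the region hook logic quoted above; not Equinox code.
class RegionFilterSketch {

    // Stand-in for the persisted digraph: bundle id -> region name.
    private final Map<Long, String> bundleToRegion = new HashMap<>();

    void assign(long bundleId, String region) {
        bundleToRegion.put(bundleId, region);
    }

    // Mirrors getRegion(Bundle): null when the digraph does not know the bundle.
    String getRegion(long bundleId) {
        return bundleToRegion.get(bundleId);
    }

    // Mirrors the quoted branch: a requirer without a region loses all
    // candidates unless this is a singleton check.
    void filterCandidates(long requirerId, List<String> candidates, boolean singleton) {
        String requirerRegion = getRegion(requirerId);
        if (requirerRegion == null) {
            if (!singleton) {
                candidates.clear();
            }
            return;
        }
        // Region visibility filtering would happen here; omitted in this sketch.
    }

    // Builds the single candidate from the report and runs it through the filter.
    static List<String> resolve(RegionFilterSketch digraph, long requirerId) {
        List<String> candidates = new ArrayList<>();
        candidates.add("javax.naming from bundle 0");
        digraph.filterCandidates(requirerId, candidates, false);
        return candidates;
    }

    public static void main(String[] args) {
        RegionFilterSketch digraph = new RegionFilterSketch();
        digraph.assign(0L, "hypothetical-region");

        // Bundle 447 is missing from the digraph, as in the failed state.
        System.out.println("unknown requirer: " + resolve(digraph, 447L));

        // Once the bundle is assigned to a region, the candidate survives.
        digraph.assign(447L, "hypothetical-region");
        System.out.println("known requirer: " + resolve(digraph, 447L));
    }
}
```

If this model is accurate, comparing which bundle ids appear in failed-digraph versus success-digraph should show whether the deployed bundles are simply absent from the failed file.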



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
