[ 
https://issues.apache.org/jira/browse/FELIX-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709807#comment-15709807
 ] 

Alexander Klimetschek edited comment on FELIX-5410 at 11/30/16 9:23 PM:
------------------------------------------------------------------------

To track the *origin of dynamically registered services*, a 
[ServiceListener|https://osgi.org/javadoc/r6/core/org/osgi/framework/ServiceListener.html]
 could be used. It would track the (last) dynamic unregistration of each service 
and inspect the stack, which looks something like this:

{noformat}
listener: at 
org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:120)
          at 
org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:991)
          at 
org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:839)
          at 
org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:546)
          at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4557)
          at org.apache.felix.framework.Felix.access$000(Felix.java:106)
          at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:420)
          at 
org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:170)
          at 
org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
origin:   at 
org.apache.jackrabbit.oak.plugins.metric.StatisticsProviderFactory.deactivate(StatisticsProviderFactory.java:103)
{noformat}

This would be stored in a map of service -> origin (class name).

In contrast, a service unregistered by SCR has a stack trace where the origin is 
{{org.apache.felix.scr}}:

{noformat}
listnr: at 
org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:120)
        at 
org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:991)
        at 
org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:839)
        at 
org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:546)
        at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4557)
        at org.apache.felix.framework.Felix.access$000(Felix.java:106)
        at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:420)
        at 
org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:170)
        at 
org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
scr:    at 
org.apache.felix.scr.impl.manager.AbstractComponentManager$3.unregister(AbstractComponentManager.java:883)
        at 
org.apache.felix.scr.impl.manager.AbstractComponentManager$3.unregister(AbstractComponentManager.java:857)
        at 
org.apache.felix.scr.impl.manager.RegistrationManager.changeRegistration(RegistrationManager.java:140)
        at 
org.apache.felix.scr.impl.manager.AbstractComponentManager.unregisterService(AbstractComponentManager.java:925)
{noformat}

In that case the origin would probably be the service implementation itself 
(which might fail to start because of an exception in its {{activate}} method).
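
The origin lookup for both traces can be sketched in plain Java. This is only a rough sketch, not part of any patch attached to this issue: it assumes the stack is captured inside the listener callback (for example with {{Thread.currentThread().getStackTrace()}}) and that the Felix class names match the traces shown above. {{UnregistrationOrigin}}, {{find}} and {{isScr}} are hypothetical names.

```java
import java.util.Optional;

/** Sketch: locate the caller that triggered a service unregistration in a Felix stack trace. */
final class UnregistrationOrigin {

    // Assumption: these names are taken from the traces above and may differ in other Felix versions.
    static final String UNREGISTER_CLASS = "org.apache.felix.framework.ServiceRegistrationImpl";
    static final String UNREGISTER_METHOD = "unregister";
    static final String SCR_PACKAGE = "org.apache.felix.scr.";

    /**
     * Returns the frame that called ServiceRegistrationImpl.unregister, if present.
     * The array is expected in the usual order: index 0 is the most recent call.
     */
    static Optional<StackTraceElement> find(StackTraceElement[] stack) {
        for (int i = 0; i < stack.length - 1; i++) {
            StackTraceElement frame = stack[i];
            if (UNREGISTER_CLASS.equals(frame.getClassName())
                    && UNREGISTER_METHOD.equals(frame.getMethodName())) {
                return Optional.of(stack[i + 1]);
            }
        }
        return Optional.empty();
    }

    /** True if the origin frame belongs to SCR, i.e. the service is a declarative services component. */
    static boolean isScr(StackTraceElement origin) {
        return origin.getClassName().startsWith(SCR_PACKAGE);
    }
}
```

For the first trace, {{find}} would return the {{StatisticsProviderFactory.deactivate}} frame. For the second it would return the {{AbstractComponentManager$3.unregister}} frame, which {{isScr}} flags so that the tool can fall back to the component implementation class.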

Once the origin class/package is known, the tool can check the bundle's state 
at troubleshooting time and possibly grep the error log file for any messages 
from that class or package, and provide them as hints.
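
The log lookup could be as simple as a substring filter. This is a rough sketch under the assumption that the error log is available as a list of lines and that log messages contain the logger's class or package name. {{LogHints}} and {{linesMentioning}} are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: collect error log lines that mention the origin class or package. */
final class LogHints {

    /** Returns, in original order, the log lines containing the given class or package name. */
    static List<String> linesMentioning(List<String> logLines, String classOrPackage) {
        return logLines.stream()
                .filter(line -> line.contains(classOrPackage))
                .collect(Collectors.toList());
    }
}
```

The lines could be read with {{java.nio.file.Files.readAllLines}}. For a large log, streaming the file and keeping only the last few matches would likely be preferable.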

Use case example: In our Sling based application, the JCR repository (database) 
is registered dynamically, and most of the application bundles depend on it 
directly or indirectly. Its startup can be prone to various low level 
exceptions (persistence problems, configuration issues), which prevent the 
dynamic registration. However, the exception message easily gets lost in the 
error log as usually there is a lot more going on when the repository restarts. 
A troubleshooting tool that can find this automatically (i.e. without knowing 
about the specific service names) would be useful.

The question is whether getting the stack trace for each service unregistration 
might be too costly. See 
http://stackoverflow.com/questions/2347828/how-expensive-is-thread-getstacktrace

While this is implementation specific (bound to Felix and requires knowing its 
internal package names), for a troubleshooting tool this is OK. It can be 
adapted for newer Felix versions where things might change.




> Web console plugin for troubleshooting wiring issues
> ----------------------------------------------------
>
>                 Key: FELIX-5410
>                 URL: https://issues.apache.org/jira/browse/FELIX-5410
>             Project: Felix
>          Issue Type: New Feature
>          Components: Web Console
>            Reporter: Alexander Klimetschek
>         Attachments: FELIX-5410-with-services.patch, FELIX-5410.patch, 
> webconsole-troubleshoot-services.png, webconsole-troubleshoot.png
>
>
> h4. Feature
> Add a new view/plugin to the standard webconsole that helps to pinpoint 
> which bundles, services or components are the true source for inactive 
> bundles or services.
> * For *bundles* the underlying assumption would be a healthy system with all 
> bundles active, and thus any inactive can be shown and analyzed as being 
> problematic.
> * For *services/components* one can look at inactive _immediate_ services 
> that fail because of unsatisfied references. For others, the user might need 
> to enter the "problematic" service or component they expect to be running to 
> start the analysis.
> h4. Motivation
> In a larger OSGi application with many bundles and components, it can be 
> difficult to find out the root cause why certain bundles do not start or why 
> a service is not active, especially for folks new to OSGi or with limited 
> knowledge about the application. I have seen many people fail, and thus "not 
> like" OSGi because of such hurdles during development, where it is easy to 
> update one bundle but miss out on crucial dependencies.
> Figuring this out is possible through the current web console, but only for 
> experts, if you click through the bundle or service details. This is usually 
> tedious work, if for example a lower level bundle is the problem, and 200 
> others are not active because of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
