[ https://issues.apache.org/jira/browse/KARAF-6224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Baptiste Onofré updated KARAF-6224:
----------------------------------------
    Fix Version/s: 4.2.6, 4.3.0

> Race condition in BaseActivator on first launch
> -----------------------------------------------
>
> Key: KARAF-6224
> URL: https://issues.apache.org/jira/browse/KARAF-6224
> Project: Karaf
> Issue Type: Bug
> Components: karaf
> Affects Versions: 4.0.10, 4.1.7, 4.2.4
> Reporter: Kurt Westerfeld
> Assignee: Jean-Baptiste Onofré
> Priority: Critical
> Fix For: 4.3.0, 4.2.6
>
>
> We run several Karaf containers on a single machine with a large number of
> cores (20). Because the core count is high, this may be a hard problem to
> reproduce. We have customized the RMI and JMX ports for each container so
> that they do not conflict. However, after the first Karaf VM is launched and
> claims ports 1099/44444, the second VM briefly attempts to do the same
> before its customized configuration can be read from the ${karaf.etc}
> directory. You can see that the management bundle gets started and then a
> configuration update happens immediately with the corrected values.
>
> Looking over BaseActivator, it seems that a thread is created to dispatch
> the initialization, and sometimes this thread encounters a null "config"
> field before the asynchronous managed service event arrives. In this case
> the configuration is missing and defaults are used. Because of this, the
> bundle temporarily attempts to use ports 1099 and 44444 until the first
> managed service event arrives via the updated() method. Immediately after
> that, the service reconfigures and uses the proper customized values.
>
> This is a problem for us because this temporary state can cause a client to
> mistakenly connect to the wrong container. We use JMX over RMI to perform a
> number of management operations, and this initial startup is unreliable.
> Our three Karaf containers have some interdependencies, and this temporary
> condition causes problems with them.
>
> This problem does not occur as often on subsequent restarts, which suggests
> that the initial provisioning of ${karaf.etc} is racing here. However, we
> have seen it happen at any time, although more rarely. We believe the high
> core count of the server this runs on results in the race condition.
>
> The suggested fix is to call config admin in run() to read the
> configuration if this.config is null. This would handle the race here, but
> could it cause other bad interactions with config admin? Not sure.
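
For illustration only, the race and the suggested fix described above can be
sketched in plain Java. This is not the actual BaseActivator code: the class
ConfigRaceSketch, its methods, the Supplier standing in for a synchronous
config admin read, and the custom port 1199 are all hypothetical, and the
property key is used purely as an example. The sketch shows the null-config
case falling back to the default port, and a run()-time lookup avoiding that.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical stand-in for an activator whose run() may execute before updated(). */
final class ConfigRaceSketch {
    static final int DEFAULT_RMI_REGISTRY_PORT = 1099;

    // Set by the asynchronous managed service callback; null until it arrives.
    private final AtomicReference<Map<String, String>> config = new AtomicReference<>();

    // Assumption: stands in for a synchronous read from config admin.
    private final Supplier<Map<String, String>> configAdminLookup;

    ConfigRaceSketch(Supplier<Map<String, String>> configAdminLookup) {
        this.configAdminLookup = configAdminLookup;
    }

    /** Simulates the asynchronous updated() callback delivering the configuration. */
    void updated(Map<String, String> properties) {
        config.set(properties);
    }

    /** Behaviour as reported: a null config silently falls back to the default port. */
    int portWithoutFallback() {
        return readPort(config.get());
    }

    /** Suggested fix: if config is still null, read it synchronously before using defaults. */
    int portWithFallback() {
        Map<String, String> current = config.get();
        if (current == null) {
            current = configAdminLookup.get();
        }
        return readPort(current);
    }

    private static int readPort(Map<String, String> properties) {
        if (properties == null) {
            return DEFAULT_RMI_REGISTRY_PORT;
        }
        String value = properties.get("rmiRegistryPort"); // illustrative key
        return value == null ? DEFAULT_RMI_REGISTRY_PORT : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> custom = Collections.singletonMap("rmiRegistryPort", "1199");
        ConfigRaceSketch sketch = new ConfigRaceSketch(() -> custom);

        System.out.println(sketch.portWithoutFallback()); // 1099: updated() has not run yet
        System.out.println(sketch.portWithFallback());    // 1199: synchronous lookup wins

        sketch.updated(custom);                           // callback finally arrives
        System.out.println(sketch.portWithoutFallback()); // 1199
    }
}
```

Whether a synchronous read at run() interacts badly with config admin, as the
reporter asks, is not something this sketch can answer.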