[ https://issues.apache.org/jira/browse/KARAF-6224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Baptiste Onofré updated KARAF-6224:
----------------------------------------
    Fix Version/s: 4.2.6
                   4.3.0

> Race condition in BaseActivator on first launch
> -----------------------------------------------
>
>                 Key: KARAF-6224
>                 URL: https://issues.apache.org/jira/browse/KARAF-6224
>             Project: Karaf
>          Issue Type: Bug
>          Components: karaf
>    Affects Versions: 4.0.10, 4.1.7, 4.2.4
>            Reporter: Kurt Westerfeld
>            Assignee: Jean-Baptiste Onofré
>            Priority: Critical
>             Fix For: 4.3.0, 4.2.6
>
>
> We run several Karaf containers on a single machine with a large number of
> cores (20). Because the core count is so high, this may be a hard problem
> to reproduce elsewhere. We have customized the RMI and JMX ports for each
> container so that they do not conflict. However, after the first Karaf VM
> is launched and claims ports 1099/44444, the second VM briefly attempts to
> claim the same ports before its customized configuration has been read from
> the ${karaf.etc} directory. You can see the management bundle start,
> followed immediately by a configuration update with the corrected values.
> Looking over BaseActivator, it seems that a thread is created to dispatch
> the initialization, and sometimes this thread encounters a null "config"
> field before the asynchronous managed service event arrives. In this case
> the configuration is missing and defaults are used. As a result, the
> service temporarily tries to use ports 1099 and 44444 until the first
> managed service event arrives via the updated() method. Immediately after
> that, the service reconfigures itself and uses the proper customized values.
> This is a problem for us because this temporary state can cause a client to
> mistakenly connect to the wrong container. We use JMX over RMI to perform a
> number of management operations, and this makes initial startup unreliable.
> Our three Karaf containers have some interdependencies, and this temporary
> condition causes problems with them.
> This problem does not occur as often on subsequent restarts, which suggests
> that the initial provisioning of ${karaf.etc} is part of the race. We have,
> however, seen it happen at other times, though more rarely. We believe the
> high core count of the server is what exposes the race condition.
> The suggested fix is to call config admin in run() to read the
> configuration if this.config is null. This would handle the race here, but
> it could cause other bad interactions with config admin. Not sure.
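
For illustration only, the fix suggested in the description might look roughly
like the sketch below. It is not Karaf's BaseActivator code: the class name,
the Supplier standing in for a synchronous config admin read, and the
"rmiRegistryPort" key are all assumptions made for this example. Only the
default port 1099 and the run()/updated() roles come from the report.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for an activator whose run() can execute before the
// asynchronous updated() callback has delivered the configuration.
class FallbackConfigSketch {
    // Default registry port mentioned in the report.
    static final int DEFAULT_RMI_REGISTRY_PORT = 1099;
    // Assumed property key, used here only for illustration.
    static final String PORT_KEY = "rmiRegistryPort";

    // Written by updated(), read by run(); volatile so run() sees the latest value.
    private volatile Map<String, String> config;

    // Hypothetical synchronous lookup standing in for a direct config admin read.
    private final Supplier<Map<String, String>> configLookup;

    FallbackConfigSketch(Supplier<Map<String, String>> configLookup) {
        this.configLookup = configLookup;
    }

    // Managed-service style callback; may arrive after run() has started.
    void updated(Map<String, String> newConfig) {
        this.config = newConfig;
    }

    // Returns the registry port that initialization would try to bind.
    int run() {
        Map<String, String> current = config;
        if (current == null) {
            // updated() has not arrived yet: read the configuration directly
            // instead of silently falling back to the defaults.
            current = configLookup.get();
        }
        if (current == null) {
            // Nothing is configured at all, so the default is the right answer.
            return DEFAULT_RMI_REGISTRY_PORT;
        }
        String port = current.getOrDefault(
                PORT_KEY, String.valueOf(DEFAULT_RMI_REGISTRY_PORT));
        return Integer.parseInt(port);
    }

    public static void main(String[] args) {
        // run() executes before any updated() call, yet still picks up the
        // customized port (1199 is an arbitrary example value).
        FallbackConfigSketch activator =
                new FallbackConfigSketch(() -> Map.of(PORT_KEY, "1199"));
        System.out.println(activator.run());
    }
}
```

With a fallback of this kind, the default port is used only when no
configuration exists at all, not merely because the update event is late.
Whether a direct read interacts badly with config admin is the open question
raised in the description, and this sketch does not answer it.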



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
