[jira] [Updated] (SLING-3750) Delay discovery-service readiness until first vote has finished, to avoid leader being overthrown

2014-10-06 Thread Carsten Ziegeler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SLING-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carsten Ziegeler updated SLING-3750:

Fix Version/s: (was: Discovery Impl 1.0.12)
   Discovery Impl 1.0.14

> Delay discovery-service readiness until first vote has finished, to avoid 
> leader being overthrown
> -
>
> Key: SLING-3750
> URL: https://issues.apache.org/jira/browse/SLING-3750
> Project: Sling
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: Discovery Impl 1.0.8
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: Discovery Impl 1.0.14
>
>
> The current implementation of discovery.impl has a subtle problem at startup. 
> Consider the following scenario with two simultaneous starts:
>  * two (sling) instances start at roughly the same time
>  * the goal is to write a service which only ever runs on one of the two
>  * to achieve that, a TopologyEventListener is used to get hold of the 
> latest TopologyView and derive whether the local instance is leader or not
>  * currently, upon registration of a TopologyEventListener, a TOPOLOGY_INIT 
> event is sent out immediately with the current TopologyView available
>  * right after startup though - hence before the first voting has passed - 
> discovery.impl considers itself to be in so-called "isolated" mode, creates a 
> topology which contains only itself, and makes itself leader (since every 
> cluster must have a leader)
>  * that means, both instances will receive that isolated view in the 
> TOPOLOGY_INIT and are marked as leader (which is kind of right as they don't 
> know about any other instance yet - but also wrong as it is not yet an 
> established view)
>  * at the same time, they both start voting, then find out about each other 
> and establish a view where one of the two is marked as leader - hence for the 
> other of the two a 'coup d'etat' is happening (the leader is overthrown even 
> though the instance did not crash). 
> This is certainly very problematic and should be avoided.
> The suggested way to avoid this is to delay two things until the first 
> voting is finished: the registration of the discovery.impl service with OSGi 
> (by making it a @Component only and registering it as a service explicitly 
> after the first voting), and the sending of TOPOLOGY_INIT.
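
The second half of the suggestion above (holding back TOPOLOGY_INIT until the first voting is finished) can be sketched as follows. This is only a minimal illustration, not the discovery.impl code: View, InitListener and DelayedInitAnnouncer are hypothetical names invented for this sketch, and the real TopologyView / TopologyEventListener types are deliberately not used so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a topology view; not the Sling API. */
final class View {
    final boolean established;   // true once a vote has produced this view
    final boolean localIsLeader; // whether the local instance is leader

    View(boolean established, boolean localIsLeader) {
        this.established = established;
        this.localIsLeader = localIsLeader;
    }
}

/** Hypothetical listener, loosely modelled on the idea of TOPOLOGY_INIT. */
interface InitListener {
    void onInit(View view);
}

/** Holds back the init event until the first vote has finished. */
final class DelayedInitAnnouncer {
    private final List<InitListener> pending = new ArrayList<>();
    private View establishedView; // stays null until the first vote finishes

    /** Registers a listener; it only sees an init event once a view is established. */
    synchronized void register(InitListener listener) {
        if (establishedView == null) {
            // No established view yet: queue instead of sending an isolated view.
            pending.add(listener);
        } else {
            listener.onInit(establishedView);
        }
    }

    /** Called once, when the first vote has produced an established view. */
    synchronized void firstVoteFinished(boolean localIsLeader) {
        if (establishedView != null) {
            return; // later votes are out of scope for this sketch
        }
        establishedView = new View(true, localIsLeader);
        for (InitListener listener : pending) {
            listener.onInit(establishedView);
        }
        pending.clear();
    }
}
```

With this shape, a listener registered right after startup never receives the isolated "I am leader" view, so it cannot be overthrown by the first vote. For brevity the sketch calls listeners while holding the lock; a real implementation would probably want to deliver events outside of it.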



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (SLING-3750) Delay discovery-service readiness until first vote has finished, to avoid leader being overthrown

2014-07-23 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/SLING-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-3750:
---

Fix Version/s: (was: Discovery Impl 1.0.10)
   Discovery Impl 1.0.12

> Delay discovery-service readiness until first vote has finished, to avoid 
> leader being overthrown
> -
>
> Key: SLING-3750
> URL: https://issues.apache.org/jira/browse/SLING-3750
> Project: Sling
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: Discovery Impl 1.0.8
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
> Fix For: Discovery Impl 1.0.12
>
>


