On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> For added fun, a node that restarts quickly enough (think a VM) won't
> even appear to have left (or rejoined) the cluster.
> At the next totem confchg event, It will simply just be there again
> with no indication that anything happened.
>
> At least this is true for the raw corosync/openais membership data,
> perhaps CPG can infer this some other way.

Cpg should not let a node go away and come back without notice. In
practice I'd expect back-to-back confchgs: one showing it leave and
another showing it join.

As Chrissie mentioned earlier, cpg shouldn't show the same node both
leaving and joining in a single confchg. In theory, though, I think it
would be legitimate. Consider a couple of examples.

m: member list, j: joined list, l: left list

1. nodes A and B join at once

A gets confchg: m=A,B j=A,B l=
B gets confchg: m=A,B j=A,B l=

2. node C joins

A gets confchg: m=A,B,C j=C l=
B gets confchg: m=A,B,C j=C l=
C gets confchg: m=A,B,C j=C l=

3. node C leaves and quickly rejoins in a single confchg

A gets confchg: m=A,B,C j=C l=C
B gets confchg: m=A,B,C j=C l=C
C gets confchg: m=A,B,C j=C l=C

4. node D joins and quickly leaves (or fails) in a single confchg

A gets confchg: m=A,B,C j=D l=D
B gets confchg: m=A,B,C j=D l=D
C gets confchg: m=A,B,C j=D l=D
D gets confchg: m=A,B,C j=D l=D ?*

* if D does a quick join+leave it may expect to see this confchg showing
it in the joined list, the left list, and not in the member list.

Again, the examples in 3 and 4 are, I think, legitimate in theory. In
practice it sounds like they won't occur. If a quick leave+join is
guaranteed to be visible through cpg, then it must be possible to observe
it at the lower level from raw corosync data.

Dave
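
For illustration only, here is a rough sketch of how a receiver could tell the four cases above apart from a single confchg. It models a confchg as just the m/j/l lists used in the examples. The names Confchg and classify are made up for this sketch and are not part of the corosync or cpg API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confchg:
    """Hypothetical model of one confchg: member, joined and left lists."""
    members: frozenset
    joined: frozenset
    left: frozenset


def classify(node, c):
    """Say what a single confchg tells us about `node`."""
    in_m = node in c.members
    in_j = node in c.joined
    in_l = node in c.left
    if in_j and in_l:
        # Examples 3 and 4: both transitions folded into one confchg.
        # Still a member means leave+rejoin; not a member means join+leave.
        return "left+rejoined" if in_m else "joined+left"
    if in_j:
        return "joined"
    if in_l:
        return "left"
    return "member" if in_m else "absent"


# Example 3: m=A,B,C j=C l=C
ex3 = Confchg(frozenset("ABC"), frozenset("C"), frozenset("C"))
# Example 4: m=A,B,C j=D l=D
ex4 = Confchg(frozenset("ABC"), frozenset("D"), frozenset("D"))

print(classify("C", ex3))
print(classify("D", ex4))
```

The only point of the sketch is that a node appearing in both j and l is unambiguous once you also check m, so a combined confchg would not lose information for the receiver.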