On Thu, Apr 9, 2009 at 19:15, David Teigland <teigl...@redhat.com> wrote:
> On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
>> For added fun, a node that restarts quickly enough (think a VM) won't
>> even appear to have left (or rejoined) the cluster.
>> At the next totem confchg event, it will simply be there again
>> with no indication that anything happened.
>> At least this is true for the raw corosync/openais membership data,
>> perhaps CPG can infer this some other way.
> Cpg should not let a node go away and come back without notice.  In practice
> I'd expect back to back confchg's: one showing it leave and another showing it
> join.

If you mean the raw confchg's that lcrsos see, then nope.
Try this: set token: to a value longer than your node takes to reboot,
then reboot a node.

For physical nodes this isn't a realistic scenario, but VMs can easily
boot in 10 seconds or so.
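
To make the point concrete, here is a toy model in Python. It is only a
sketch of the claim above, not how totem is actually implemented: it
assumes a node is reported as gone only when it stays silent for longer
than the token timeout. The function name and the sample outage lengths
are made up for illustration; the timeout is in milliseconds, the same
unit token: uses in corosync.conf.

```python
def visible_outages(outages, token_timeout_ms):
    """Return the outages a purely timeout-based detector would report.

    outages: list of (node, silent_ms) pairs, i.e. how long each node
             stopped responding (hypothetical sample data).
    token_timeout_ms: silence tolerated before a node is declared gone.
    """
    return [(node, silent_ms)
            for node, silent_ms in outages
            if silent_ms > token_timeout_ms]


# A VM that reboots in 10s and a physical node that takes 3 minutes.
outages = [("vm1", 10_000), ("phys1", 180_000)]

# With a 30s token timeout only the physical node's outage is reported;
# the VM is simply "there again" at the next confchg.
print(visible_outages(outages, token_timeout_ms=30_000))

# With a 5s token timeout both outages are reported.
print(visible_outages(outages, token_timeout_ms=5_000))
```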

> As Chrissie mentioned earlier, cpg shouldn't show the same node both
> leaving and joining in a single confchg.  In theory I think it would be
> legitimate.
>
> Consider a couple examples.
> m: member list, j: joined list, l: left list
>
> 1. nodes A and B join at once
>    A gets confchg: m=A,B j=A,B l=
>    B gets confchg: m=A,B j=A,B l=
>
> 2. node C joins
>    A gets confchg: m=A,B,C j=C l=
>    B gets confchg: m=A,B,C j=C l=
>    C gets confchg: m=A,B,C j=C l=
>
> 3. node C leaves and quickly rejoins in a single confchg
>    A gets confchg: m=A,B,C j=C l=C
>    B gets confchg: m=A,B,C j=C l=C
>    C gets confchg: m=A,B,C j=C l=C
>
> 4. node D joins and quickly leaves (or fails) in a single confchg
>    A gets confchg: m=A,B,C j=D l=D
>    B gets confchg: m=A,B,C j=D l=D
>    C gets confchg: m=A,B,C j=D l=D
>    D gets confchg: m=A,B,C j=D l=D ?*
>
> * if D does a quick join+leave it may expect to see this confchg showing it in
> the joined list, the left list, and not in the member list.
>
> Again, the examples in 3 and 4 are, I think, legitimate in theory.  In
> practice it sounds like they won't occur.
>
> If a quick leave+join is guaranteed to be visible through cpg, then it must be
> possible to observe at the lower level from raw corosync data.
>
> Dave
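
For what it's worth, the m/j/l examples quoted above can be told apart
mechanically. The Python below is a rough sketch under my own
assumptions, not the cpg API (the real callback receives arrays of
struct cpg_address, not lists of names). The classify() helper and its
labels are hypothetical. It relies on one observation: when a node shows
up in both the joined and left lists, the member list says which event
happened last.

```python
def classify(members, joined, left):
    """Label every node named in one confchg's joined/left lists.

    members, joined, left: iterables of node names (the m, j and l
    lists from the examples above).
    """
    members, joined, left = set(members), set(joined), set(left)
    events = {}
    for node in sorted(joined | left):
        if node in joined and node in left:
            # In both lists: still a member means it left and came back
            # (example 3); not a member means it joined and then left
            # (example 4).
            events[node] = "left+rejoined" if node in members else "joined+left"
        elif node in joined:
            events[node] = "joined"
        else:
            events[node] = "left"
    return events


print(classify(["A", "B", "C"], ["C"], []))     # example 2
print(classify(["A", "B", "C"], ["C"], ["C"]))  # example 3
print(classify(["A", "B", "C"], ["D"], ["D"]))  # example 4
```

This only helps if the layer delivering the confchg actually puts the
node in both lists, which is exactly the open question here.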
Openais mailing list
