On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield <ccaul...@redhat.com> wrote:
> Joel Becker wrote:
>> Steve, Dave, etc,
>>       Someone told me a while back that a node joining a cpg group
>> would be by its lonesome in the join message.  That is, when the node
>> gets its first confchg, it will be the only node in the list of joins.
>> I've been using this to detect the first joiner of the group ("I joined,
>> and the member count is 1").
>>       Dave's since told me that this assumption is not valid (if it
>> ever was).  So two or more nodes can join in parallel, and each can see
>> more than one node in the list of joins for its first confchg.  I'm now
>> trying to figure out an algorithm for "first joiner".  I have a couple
>> of questions:
>>
>> 1) If I see member_count == join_count, does that mean every member has
>> just joined, and all the members are receiving the same join message?
>>
>> 2) If member_count == join_count, can leave_count be non-zero?  If it
>> is, am I guaranteed that we're looking at "all old members left, all new
>> members joined"?
>>
>>       If these both are true, I can simply isolate a "first joiner" by
>> checking member_count == join_count and selecting the lowest node
>> number.
>
>
> I don't think you can detect a first-joiner using CPG. cman does it by
> reading the totem confchg messages. It is quite possible for two nodes
> to join at the same time ... during the same SYNC phase so you certainly
> can't rely on that.
>
> 1) If member_count == join_count, then it's a safe bet that they are all
> new nodes, and yes, it is true that all nodes should see the same
> confchg messages.
>
> 2) If join_count > 0 then leave_count will always be zero. That's a
> consequence of how CPG sends its messages really, join and leave
> messages are always separate. Don't rely on this behaviour though!
> Although I can't see any reason to change it, I'd rather not have it
> burned into the de facto specification.
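
For reference, the check being discussed (member_count == join_count,
then pick the lowest node number) can be sketched as below. This is only
an illustration in Python, not CPG code: the function name and the plain
lists of node IDs are made up for the example, and it deliberately
ignores the leave list, since we were just told not to rely on
leave_count being zero. As noted above, it is a heuristic and not a
reliable first-joiner test.

```python
from typing import Sequence


def is_first_joiner_candidate(
    local_nodeid: int,
    members: Sequence[int],
    joined: Sequence[int],
) -> bool:
    """Heuristic only: should this node act as the "first joiner"?

    `members` and `joined` stand in for the member and join lists of a
    single confchg event (hypothetical representation, not the CPG API).
    """
    # member_count == join_count: per the thread, a safe bet that every
    # current member has only just joined.
    all_new = len(members) > 0 and len(members) == len(joined)
    # All nodes should see the same confchg, so picking the lowest node
    # ID is a tie-break that every node computes identically.
    return all_new and local_nodeid == min(members)
```

With members [1, 2] and joins [1, 2], node 1 gets True and node 2 gets
False; a node joining an existing group (joins shorter than members)
always gets False.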

For added fun, a node that restarts quickly enough (think of a VM) won't
even appear to have left (or rejoined) the cluster.
At the next totem confchg event, it will simply be there again
with no indication that anything happened.

At least this is true for the raw corosync/openais membership data;
perhaps CPG can infer this some other way.
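
To make the restart case concrete, here is a small sketch (plain Python,
hypothetical node IDs, not the corosync API) of what anything that only
compares successive membership lists would see:

```python
def membership_delta(before, after):
    """Return (joined, left) between two membership snapshots."""
    before_set, after_set = set(before), set(after)
    joined = sorted(after_set - before_set)
    left = sorted(before_set - after_set)
    return joined, left


# Node 2 restarts between the two snapshots but is back before the next
# confchg, so it is present in both lists.
print(membership_delta([1, 2, 3], [1, 2, 3]))  # prints ([], [])
```

The delta is empty, so the restart cannot be recovered from membership
alone; detecting it would need some extra per-node state, which is
outside what this sketch assumes.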
_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
