Forgot to copy the list.

Honza, lgtm.

Regards
-steve

On Thu, Jan 15, 2015 at 5:20 AM, Steven Dake <[email protected]> wrote:

> Honza,
>
> lgtm.
>
> regards
> -steve
>
> On Wed, Jan 14, 2015 at 10:19 AM, Jan Friesse <[email protected]> wrote:
>
>> Jason,
>> the patch looks good. This touches a very delicate part of the protocol,
>> so I would really like to see another reviewer's comments as well.
>> Chrissie, Steve?
>>
>> Regards,
>>   Honza
>>
>>
>> jason wrote:
>> > In active rrp mode, commit tokens are treated as mcast data messages,
>> > so rrp delivers them directly to the srp layer via active_mcast_recv().
>> > As a result, srp receives duplicate commit tokens from the different
>> > heartbeat links. If a node is in recovery state and has already sent
>> > out the initial orf token, those duplicate commit tokens cause
>> > message_handler_memb_commit_token() to send the initial orf token
>> > again. This is wrong because it resets the orf token content in
>> > instance->orf_token_retransmit, which breaks the token retransmission
>> > state.
>> >
>> > Furthermore, sending the initial orf token again and again may lead
>> > active_token_recv() to drop some subsequent orf tokens. That is
>> > normally fine for rrp because srp retransmits tokens, but as said
>> > above, the srp retransmission state has already been broken, so we
>> > eventually hit a "token lost in recovery state" condition caused by
>> > software. If the token timeout value is large, it then takes a long
>> > time to create a new ring.
>> >
>> > This can be reproduced with two nodes set to active rrp mode and two
>> > heartbeat links. Keep one node always on and let the other one
>> > stop/start again and again. The probability of reproducing it is low.
>> > In theory, I think, the more heartbeat links are used, the more
>> > easily it can be reproduced.
>> >
>> > This problem can be resolved by letting
>> > message_handler_memb_commit_token() ignore duplicate commit tokens
>> > in recovery state if the node (the ring representative) has already
>> > sent out the initial orf token.
>> >
>> > Unlike the previous take, this version does not depend on stored token
>> > data but uses originated_orf_token in totemsrp_instance to remember
>> > whether the initial orf token has already been originated for the
>> > current membership.
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > discuss mailing list
>> > [email protected]
>> > http://lists.corosync.org/mailman/listinfo/discuss
>> >
>>
>
>
