On 02/07/2011 11:11 PM, renayama19661...@ybb.ne.jp wrote:
> Hi Steven,
> 
> I misunderstood your point.
> 
> We do not have a simple test case.
> 
> The phenomenon observed in our environment is as follows.
> 
> Step 1) corosync forms a cluster of 12 nodes.
>  * Token communication begins.
> 
> Step 2) One node hits [FAILED TO RECEIVE].
> 
> Step 3) All 12 nodes start reconfiguring the cluster.
> 
> Step 4) The node that hit [FAILED TO RECEIVE] fails to reach consensus in the
> JOIN communication.
>  * Because the node failed to reach consensus, its failed_list and proc_list
> end up with the same contents.
>  * When this node then compares failed_list with proc_list, the assert fails.
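> 
> As a rough illustration (not the actual corosync code, and assuming an
> srp_addr_equal()-style comparison helper), memb_consensus_agreed()
> effectively subtracts the failed list from the proc list and asserts that
> at least one member remains:
> 
> static int remaining_members (
> 	const struct srp_addr *proc_list, int proc_entries,
> 	const struct srp_addr *failed_list, int failed_entries)
> {
> 	int remaining = 0;
> 	int i, j, failed;
> 
> 	for (i = 0; i < proc_entries; i++) {
> 		failed = 0;
> 		for (j = 0; j < failed_entries; j++) {
> 			if (srp_addr_equal (&proc_list[i], &failed_list[j])) {
> 				failed = 1; /* also present in the failed list */
> 			}
> 		}
> 		if (failed == 0) {
> 			remaining++;
> 		}
> 	}
> 	return (remaining);
> }
> 
> With proc_list and failed_list equal, remaining_members() returns 0, so an
> assert like assert (token_memb_entries >= 1) can only fail.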
> 
> 
> When the node that forms the cluster stands alone, I think that assert() is
> unnecessary.
> 
> The reason is that the following code path exists.
> 
> 

Please try the patch I have sent to this mailing list.  If the issue persists,
we can look at more options.

Thanks!
-steve


> 
> static void memb_join_process (
>       struct totemsrp_instance *instance,
>       const struct memb_join *memb_join)
> {
>       struct srp_addr *proc_list;
>       struct srp_addr *failed_list;
> (snip)
>                               instance->failed_to_recv = 0;
>                               srp_addr_copy (&instance->my_proc_list[0],
>                                       &instance->my_id);
>                               instance->my_proc_list_entries = 1;
>                               instance->my_failed_list_entries = 0;
> 
>                               memb_state_commit_token_create (instance);
> 
>                               memb_state_commit_enter (instance);
>                               return;
> 
> (snip)
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> 
> --- renayama19661...@ybb.ne.jp wrote:
> 
>> Hi Steven,
>>
>>> Hideo,
>>>
>>> If you have a test case, I can make a patch for you to try.
>>>
>>
>> All right.
>>
>> We use corosync 1.3.0.
>>
>> Please send me the patch.
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>> --- Steven Dake <sd...@redhat.com> wrote:
>>
>>> On 02/06/2011 09:16 PM, renayama19661...@ybb.ne.jp wrote:
>>>> Hi Steven,
>>>> Hi Dejan,
>>>>
>>>>>>>> This code never got a chance to run because on failed_to_recv
>>>>>>>> the two sets (my_process_list and my_failed_list) are equal which
>>>>>>>> makes the assert fail in memb_consensus_agreed():
>>>>
>>>> We are hitting the same problem, and it is troubling us as well.
>>>>
>>>> How did this discussion turn out?
>>>>
>>>> Best Regards,
>>>> Hideo Yamauchi.
>>>>
>>>
>>> Hideo,
>>>
>>> If you have a test case, I can make a patch for you to try.
>>>
>>> Regards
>>> -steve
>>>
>>>>
>>>> --- Dejan Muhamedagic <de...@suse.de> wrote:
>>>>
>>>>> nudge, nudge
>>>>>
>>>>> On Wed, Jan 05, 2011 at 02:05:55PM +0100, Dejan Muhamedagic wrote:
>>>>>> On Tue, Jan 04, 2011 at 01:53:00PM -0700, Steven Dake wrote:
>>>>>>> On 12/23/2010 06:14 AM, Dejan Muhamedagic wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Wed, Dec 01, 2010 at 05:30:44PM +0200, Vladislav Bogdanov wrote:
>>>>>>>>> 01.12.2010 16:32, Dejan Muhamedagic wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On Tue, Nov 23, 2010 at 12:53:42PM +0200, Vladislav Bogdanov wrote:
>>>>>>>>>>> Hi Steven, hi all.
>>>>>>>>>>>
>>>>>>>>>>> I often see this assert on one of the nodes after I stop corosync on
>>>>>>>>>>> another node in a newly set up 4-node cluster.
>>>>>>>>>>
>>>>>>>>>> Does the assert happen on a node-lost event? Or once a new
>>>>>>>>>> partition is formed?
>>>>>>>>>
>>>>>>>>> I first noticed it when I rebooted another node, just after the console
>>>>>>>>> said that OpenAIS was stopped.
>>>>>>>>>
>>>>>>>>> Can't say right now exactly what event it followed; I'm actually
>>>>>>>>> fighting with several problems with corosync, pacemaker, NFS4 and
>>>>>>>>> phantom uncorrectable ECC errors simultaneously, and I'm a bit lost
>>>>>>>>> with all of them.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> #0  0x00007f51953e49a5 in raise () from /lib64/libc.so.6
>>>>>>>>>>> #1  0x00007f51953e6185 in abort () from /lib64/libc.so.6
>>>>>>>>>>> #2  0x00007f51953dd935 in __assert_fail () from /lib64/libc.so.6
>>>>>>>>>>> #3  0x00007f5196176406 in memb_consensus_agreed
>>>>>>>>>>> (instance=0x7f5196554010) at totemsrp.c:1194
>>>>>>>>>>> #4  0x00007f519617b2f3 in memb_join_process 
>>>>>>>>>>> (instance=0x7f5196554010,
>>>>>>>>>>> memb_join=0x262f628) at totemsrp.c:3918
>>>>>>>>>>> #5  0x00007f519617b619 in message_handler_memb_join
>>>>>>>>>>> (instance=0x7f5196554010, msg=<value optimized out>, msg_len=<value
>>>>>>>>>>> optimized out>, endian_conversion_needed=<value optimized out>)
>>>>>>>>>>>     at totemsrp.c:4161
>>>>>>>>>>> #6  0x00007f5196173ba7 in passive_mcast_recv 
>>>>>>>>>>> (rrp_instance=0x2603030,
>>>>>>>>>>> iface_no=0, context=<value optimized out>, msg=<value optimized 
>>>>>>>>>>> out>,
>>>>>>>>>>> msg_len=<value optimized out>) at totemrrp.c:720
>>>>>>>>>>> #7  0x00007f5196172b44 in rrp_deliver_fn (context=<value optimized 
>>>>>>>>>>> out>,
>>>>>>>>>>> msg=0x262f628, msg_len=420) at totemrrp.c:1404
>>>>>>>>>>> #8  0x00007f5196171a76 in net_deliver_fn (handle=<value optimized 
>>>>>>>>>>> out>,
>>>>>>>>>>> fd=<value optimized out>, revents=<value optimized out>, 
>>>>>>>>>>> data=0x262ef80)
>>>>>>>>>>> at totemudp.c:1244
>>>>>>>>>>> #9  0x00007f519616d7f2 in poll_run (handle=4858364909567606784) at
>>>>>>>>>>> coropoll.c:510
>>>>>>>>>>> #10 0x0000000000406add in main (argc=<value optimized out>, 
>>>>>>>>>>> argv=<value
>>>>>>>>>>> optimized out>, envp=<value optimized out>) at main.c:1680
>>>>>>>>>>>
>>>>>>>>>>> Last fplay lines are:
>>>>>>>>>>>
>>>>>>>>>>> rec=[36124] Log Message=Delivering MCAST message with seq 1366 to
>>>>>>>>>>> pending delivery queue
>>>>>>>>>>> rec=[36125] Log Message=Delivering MCAST message with seq 1367 to
>>>>>>>>>>> pending delivery queue
>>>>>>>>>>> rec=[36126] Log Message=Received ringid(10.5.4.52:12660) seq 1366
>>>>>>>>>>> rec=[36127] Log Message=Received ringid(10.5.4.52:12660) seq 1367
>>>>>>>>>>> rec=[36128] Log Message=Received ringid(10.5.4.52:12660) seq 1366
>>>>>>>>>>> rec=[36129] Log Message=Received ringid(10.5.4.52:12660) seq 1367
>>>>>>>>>>> rec=[36130] Log Message=releasing messages up to and including 1367
>>>>>>>>>>> rec=[36131] Log Message=FAILED TO RECEIVE
>>>>>>>>>>> rec=[36132] Log Message=entering GATHER state from 6.
>>>>>>>>>>> rec=[36133] Log Message=entering GATHER state from 0.
>>>>>>>>>>> Finishing replay: records found [33993]
>>>>>>>>>>>
>>>>>>>>>>> What could be the reason for this? Bug, switches, memory errors?
>>>>>>>>>>
>>>>>>>>>> The assertion fails because corosync finds out that
>>>>>>>>>> instance->my_proc_list and instance->my_failed_list are
>>>>>>>>>> equal. That happens immediately after the "FAILED TO RECEIVE"
>>>>>>>>>> message, which is issued after fail_recv_const token rotations have
>>>>>>>>>> passed without any multicast packet being received (defaults to 50).
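>>>>>>>>>>
>>>>>>>>>> For reference, that constant is the fail_recv_const directive in the
>>>>>>>>>> totem section of corosync.conf (the value below is just the default
>>>>>>>>>> mentioned above; the rest of the totem section is omitted):
>>>>>>>>>>
>>>>>>>>>> totem {
>>>>>>>>>>         # token rotations without receiving any multicast before
>>>>>>>>>>         # the FAILED TO RECEIVE recovery path is entered
>>>>>>>>>>         fail_recv_const: 50
>>>>>>>>>> }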
>>>>>>>>
>>>>>>>> I took a look at the code and the protocol specification again
>>>>>>>> and it seems like that assert is not valid since Steve patched
>>>>>>>> the part dealing with the "FAILED TO RECEIVE" condition. The
>>>>>>>> patch is from 2010-06-03 posted to the list here
>>>>>>>> http://marc.info/?l=openais&m=127559807608484&w=2
>>>>>>>>
>>>>>>>> The last hunk of the patch contains this code (exec/totemsrp.c):
>>>>>>>>
>>>>>>>> 3933         if (memb_consensus_agreed (instance) && instance->failed_to_recv == 1) {
>>>>>>>> 3934                 instance->failed_to_recv = 0;
>>>>>>>> 3935                 srp_addr_copy (&instance->my_proc_list[0],
>>>>>>>> 3936                     &instance->my_id);
>>>>>>>> 3937                 instance->my_proc_list_entries = 1;
>>>>>>>> 3938                 instance->my_failed_list_entries = 0;
>>>>>>>> 3939            
>>>>>>>> 3940                 memb_state_commit_token_create (instance);
>>>>>>>> 3941            
>>>>>>>> 3942                 memb_state_commit_enter (instance);
>>>>>>>> 3943                 return;
>>>>>>>> 3944         }
>>>>>>>>
>>>>>>>> This code never got a chance to run because on failed_to_recv
>>>>>>>> the two sets (my_process_list and my_failed_list) are equal which
>>>>>>>> makes the assert fail in memb_consensus_agreed():
>>>>>>>>
>>>>>>>> 1185     memb_set_subtract (token_memb, &token_memb_entries,
>>>>>>>> 1186         instance->my_proc_list, instance->my_proc_list_entries,
>>>>>>>> 1187         instance->my_failed_list, 
>>>>>>>> instance->my_failed_list_entries);
>>>>>>>> ...
>>>>>>>> 1195     assert (token_memb_entries >= 1);
>>>>>>>>
>>>>>>>> In other words, with A standing for "the proc and failed lists are
>>>>>>>> equal", it's something like this:
>>>>>>>>
>>>>>>>>        if A:
>>>>>>>>                if memb_consensus_agreed() and failed_to_recv:
>>>>>>>>                        form a single node ring and try to recover
>>>>>>>>
>>>>>>>>        memb_consensus_agreed():
>>>>>>>>                assert(!A)
>>>>>>>>
>>>>>>>> Steve, can you take a look and confirm that this holds.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>
>>>>>>> Dejan,
>>>>>>>
>>>>>>> Sorry for the delay in responding - big backlog which is mostly cleared out :)
>>>>>>
>>>>>> No problem.
>>>>>>
>>>>>>> The assert definitely isn't correct, but removing it without addressing
>>>>>>> the contents of the proc and fail lists is also not right.  That would
>>>>>>> cause the logic in the if statement at line 3933 not to be executed
>>>>>>> (because the first part of the if would evaluate to false).
>>>>>>
>>>>>> Actually it wouldn't. The agreed variable is set to 1 and it
>>>>>> is going to be returned unchanged.
>>>>>>
>>>>>>> I believe
>>>>>>> what we should do is check the "failed_to_recv" value in
>>>>>>> memb_consensus_agreed instead of at line 3933.
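>>>>>>>
>>>>>>> Roughly something like this (just an untested sketch of the idea, not a
>>>>>>> final patch):
>>>>>>>
>>>>>>> static int memb_consensus_agreed (struct totemsrp_instance *instance)
>>>>>>> {
>>>>>>> (snip)
>>>>>>> 	memb_set_subtract (token_memb, &token_memb_entries,
>>>>>>> 		instance->my_proc_list, instance->my_proc_list_entries,
>>>>>>> 		instance->my_failed_list, instance->my_failed_list_entries);
>>>>>>>
>>>>>>> 	if (instance->failed_to_recv == 1) {
>>>>>>> 		/*
>>>>>>> 		 * Every peer is in the failed list, so skip the assert and
>>>>>>> 		 * report agreement; the caller then forms a single node ring.
>>>>>>> 		 */
>>>>>>> 		return (1);
>>>>>>> 	}
>>>>>>> 	assert (token_memb_entries >= 1);
>>>>>>> (snip)
>>>>>>> }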
>>>>>>>
>>>>>>> The issue with this is memb_state_consensus_timeout_expired which also
>>>>>>> executes some 'then' logic where we may not want to execute the
>>>>>>> failed_to_recv logic.
>>>>>>
>>>>>> Perhaps we should just
>>>>>>
>>>>>> 3933         if (instance->failed_to_recv == 1) {
>>>>>>
>>>>>> ? In the failed_to_recv case both proc and fail lists are equal, so
>>>>>> checking memb_consensus_agreed won't make sense, right?
>>>>>>
>>>>>>> If anyone has a reliable reproducer and can forward it to me, I'll test out
>>>>>>> a change to address this problem.  I'm really hesitant to change anything in
>>>>>>> totemsrp without a test case for this problem - it's almost perfect ;-)
>>>>>>
>>>>>> Since the tester upgraded the switch firmware they couldn't
>>>>>> reproduce it anymore.
>>>>>>
>>>>>> Would compiling with these help?
>>>>>>
>>>>>> /*
>>>>>>  * These can be used to test the error recovery algorithms
>>>>>>  * #define TEST_DROP_ORF_TOKEN_PERCENTAGE 30
>>>>>>  * #define TEST_DROP_COMMIT_TOKEN_PERCENTAGE 30
>>>>>>  * #define TEST_DROP_MCAST_PERCENTAGE 50
>>>>>>  * #define TEST_RECOVERY_MSG_COUNT 300
>>>>>>  */
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Dejan
>>>>>>
>>>>>>> Regards
>>>>>>> -steve
>>>>>>>
>>>>>>>> Dejan

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
