"Andrew Beekhof" <[EMAIL PROTECTED]> writes:
(snip)
>>  Here's my observation:
>>
>>   - An element of pending_ops is removed at lrm.c:L947
>>   - That removal is called from inside g_hash_table_foreach() at L1475
>>   - This violates the usage of g_hash_table_foreach() according
>>    to the glib manual.
>>   - Therefore the iteration cannot proceed correctly and would
>>    try to refer to a removed element.
>
> Turns out that the Stateful resource in CTS was never getting promoted.
> Once I fixed this, I was able to trigger the bug too (in the last few 
> minutes).

The odd thing is that it is not reproducible in every environment.

As far as we've tested:
 - it _always_ happens on a RedHat 4 environment.
 - it has _never_ happened on a RedHat 5 environment.

I'm not sure whether it's the only difference, but
the difference in glib versions may be related to
this behavior.
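
Independently of the glib version, the pattern in the observation quoted
above (removing an entry from inside the foreach callback) can be avoided
by letting the iteration itself perform the removal. glib provides
g_hash_table_foreach_remove() for that purpose: the callback only returns
TRUE or FALSE, and the table does the removing. The sketch below models
that idea in plain C without glib, so every type and function in it
(struct pending_op, struct op_table, table_foreach_remove,
op_belongs_to_rsc, and so on) is a hypothetical stand-in and not the
actual crmd code or the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pending operation; not the real crmd type. */
struct pending_op {
    const char *key;     /* operation id */
    const char *rsc_id;  /* resource the operation belongs to */
    bool in_use;         /* true while the slot holds a live entry */
};

#define MAX_OPS 8

/* Tiny fixed-size table standing in for the pending_ops hash table. */
struct op_table {
    struct pending_op slots[MAX_OPS];
};

/* Predicate type: return true if the entry should be removed. */
typedef bool (*remove_pred)(const struct pending_op *op, void *user_data);

/* Store an entry in the first free slot; returns false if the table is full. */
static bool table_insert(struct op_table *t, const char *key, const char *rsc_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_OPS; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].key = key;
            t->slots[i].rsc_id = rsc_id;
            t->slots[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Count the live entries. */
static size_t table_size(const struct op_table *t)
{
    size_t n = 0;
    assert(t != NULL);
    for (size_t i = 0; i < MAX_OPS; i++) {
        if (t->slots[i].in_use) {
            n++;
        }
    }
    return n;
}

/*
 * Visit every live entry and remove those for which pred returns true.
 * The callback never modifies the table; only this function does, so the
 * iteration state stays valid. Returns the number of entries removed.
 */
static size_t table_foreach_remove(struct op_table *t, remove_pred pred,
                                   void *user_data)
{
    size_t removed = 0;
    assert(t != NULL && pred != NULL);
    for (size_t i = 0; i < MAX_OPS; i++) {
        struct pending_op *op = &t->slots[i];
        if (op->in_use && pred(op, user_data)) {
            op->in_use = false;
            removed++;
        }
    }
    return removed;
}

/* Example predicate: select all operations of one resource. */
static bool op_belongs_to_rsc(const struct pending_op *op, void *user_data)
{
    const char *rsc_id = (const char *)user_data;
    return strcmp(op->rsc_id, rsc_id) == 0;
}
```

Calling table_foreach_remove(&table, op_belongs_to_rsc, rsc_id) drops
every operation of that resource in one pass. With glib, the equivalent
shape would be a GHRFunc callback passed to g_hash_table_foreach_remove().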


>
> Thanks for your diagnosis and the patch, you've certainly saved me some time 
> :-)
>
>>
>>  http://hg.linux-ha.org/lha-2.1/annotate/333aef5bd4ed/crm/crmd/lrm.c
>>  (...)
>>  946             /* not doing this will block the node from shutting down */
>>  947             g_hash_table_remove(pending_ops, key);
>>  (...)
>>  1475            g_hash_table_foreach(pending_ops, stop_recurring_action_by_rsc, rsc);
>>
>>  
>> http://library.gnome.org/devel/glib/stable/glib-Hash-Tables.html#g-hash-table-foreach
>>  (...)
>>  The hash table may not be modified while iterating over it (you can't 
>> add/remove items).
>>
>>
>>  I have also attached my suggested patch. I cannot guarantee its
>>  correctness; it is just to show you the idea.
>>
>>  Thanks,

-- 
Keisuke MORI
NTT DATA Intellilink Corporation
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
