FYI: there is code in the heartbeat communication layer which is quite 
happy to simulate lost packets.

I made it difficult to turn on accidentally.  Read the code for details 
if you're interested.
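
For readers who want the general idea without reading the source, here is
a minimal sketch of a packet-loss simulator that stays off unless it is
switched on explicitly. It is not the Heartbeat implementation: every name
in it (loss_sim, loss_sim_init, loss_sim_should_drop, drop_per_1000) is
hypothetical, and the real code may work quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; not taken from the Heartbeat sources. */
struct loss_sim {
    bool     enabled;        /* off by default; must be set on purpose */
    uint32_t drop_per_1000;  /* packets to drop out of every 1000 */
    uint32_t state;          /* PRNG state, so test runs are repeatable */
};

/* xorshift32 pseudo-random generator; the state must never be zero. */
static uint32_t loss_sim_next(struct loss_sim *ls)
{
    uint32_t x = ls->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    ls->state = x;
    return x;
}

/* Prepare a simulator that is configured but still disabled. */
void loss_sim_init(struct loss_sim *ls, uint32_t drop_per_1000, uint32_t seed)
{
    assert(ls != NULL);
    ls->enabled = false;
    ls->drop_per_1000 = drop_per_1000 > 1000 ? 1000 : drop_per_1000;
    ls->state = seed != 0 ? seed : 1;
}

/* Return true if the caller should pretend this packet was lost. */
bool loss_sim_should_drop(struct loss_sim *ls)
{
    assert(ls != NULL);
    if (!ls->enabled || ls->drop_per_1000 == 0) {
        return false;
    }
    return loss_sim_next(ls) % 1000 < ls->drop_per_1000;
}
```

A send path would call loss_sim_should_drop() just before writing a packet
and skip the write when it returns true. Keeping "enabled" separate from
the drop rate means a leftover rate setting alone never drops traffic.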



On 04/30/2012 10:21 PM, renayama19661...@ybb.ne.jp wrote:
> Hi Lars,
>
> We confirmed that this problem occurs in Heartbeat's v1 mode.
>   * The problem occurs in v2 mode in the same way.
>
> We reproduced the problem with the following procedure.
>
> Step 1) Place a special device in the network that drops Heartbeat
> communication packets.
>
> Step 2) The nodes then retransmit messages to each other repeatedly.
>
> Step 3) The memory of the master process then grows little by little.
>
>
> -------- Output of the ps command for the master process ----------
> * node1
> (start)
> 32126 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master control process
> (One hour later)
> 32126 ?        SLs    0:03      0   182 54729  7868  0.0 heartbeat: master control process
> (Two hours later)
> 32126 ?        SLs    0:08      0   182 55317  8456  0.0 heartbeat: master control process
> (Four hours later)
> 32126 ?        SLs    0:24      0   182 56673  9812  0.0 heartbeat: master control process
>
> * node2
> (start)
> 31928 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master control process
> (One hour later)
> 31928 ?        SLs    0:02      0   182 54481  7620  0.0 heartbeat: master control process
> (Two hours later)
> 31928 ?        SLs    0:08      0   182 55353  8492  0.0 heartbeat: master control process
> (Four hours later)
> 31928 ?        SLs    0:23      0   182 56689  9828  0.0 heartbeat: master control process
>
>
> The amount of leaked memory seems to vary from node to node with the
> number of retransmissions.
>
> This memory growth disappears when my patch is applied.
>
> A similar fix seems to be necessary in send_reqnodes_msg(), but the leak
> there appears to be small.
>
> Best Regards,
> Hideo Yamauchi.
>
>
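
To put a number on the growth in the ps samples quoted above, here is a
small sketch that averages the increase between the first and last sample.
It assumes the column just before the 0.0 (%MEM) value is the resident set
size and that ps reports it in KB; the function and array names are made
up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Samples copied from the quoted ps output; unit assumed to be KB. */
const double sample_hours[] = { 0.0, 1.0, 2.0, 4.0 };
const long   node1_rss_kb[] = { 7128, 7868, 8456, 9812 };
const long   node2_rss_kb[] = { 7128, 7620, 8492, 9828 };

/*
 * Average growth per hour between the first and the last sample.
 * Needs at least two samples and a non-zero time span.
 */
double rss_growth_per_hour(const long *rss_kb, const double *hours, size_t n)
{
    assert(rss_kb != NULL && hours != NULL);
    assert(n >= 2);
    assert(hours[n - 1] > hours[0]);
    return (double)(rss_kb[n - 1] - rss_kb[0]) / (hours[n - 1] - hours[0]);
}
```

With the samples above this gives 671 per hour for node1 and 675 per hour
for node2, in whatever unit the ps column uses (KB under the assumption
stated above). An end-to-end average like this hides the hour-to-hour
variation, which in these samples is noticeable.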
> --- On Sat, 2012/4/28, renayama19661...@ybb.ne.jp <renayama19661...@ybb.ne.jp> wrote:
>
>> Hi Lars,
>>
>> Thank you for your comments.
>>
>>> Have you actually been able to measure that memory leak you observed,
>>> and you can confirm this patch will fix it?
>>>
>>> Because I don't think this patch has any effect.
>> Yes.
>> I actually measured the leak.
>> I can show the results next week.
>> # Japan is on holiday until Tuesday.
>>
>>> send_rexmit_request() is only used as parameter to
>>> Gmain_timeout_add_full, and it returns FALSE always,
>>> which should cause the respective sourceid to be auto-removed.
>> It seems that the gsource needs to be released explicitly in some way.
>> lrmd appears to perform a similar release.
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>> --- On Fri, 2012/4/27, Lars Ellenberg<lars.ellenb...@linbit.com>  wrote:
>>
>>> On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp wrote:
>>>> Hi All,
>>>>
>>>> We ran tests that assume a remote cluster environment,
>>>> including tests with packet loss.
>>>>
>>>> Heartbeat's retransmission timer causes a memory leak.
>>>>
>>>> I am contributing a patch.
>>>> Please review the contents of the patch
>>>> and apply it to the Heartbeat repository.
>>> Have you actually been able to measure that memory leak you observed,
>>> and you can confirm this patch will fix it?
>>>
>>> Because I don't think this patch has any effect.
>>>
>>> send_rexmit_request() is only used as parameter to
>>> Gmain_timeout_add_full, and it returns FALSE always,
>>> which should cause the respective sourceid to be auto-removed.
>>>
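
The auto-removal described above and a measurable leak are not necessarily
in conflict. The toy model below shows one way both could hold: a
reference-counted timer source is detached from the loop when its callback
returns false, yet its memory is freed only once the creator also drops
its own reference. This is an assumption for illustration, not a
description of Gmain_timeout_add_full or of GLib internals, and all toy_*
names are invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef bool (*toy_timer_cb)(void *data);

/* Toy model of a reference-counted timer source (hypothetical). */
struct toy_source {
    int          refcount;
    bool         attached;   /* true while the loop will dispatch it */
    toy_timer_cb cb;
    void        *data;
};

int toy_live_sources = 0;    /* sources allocated and not yet freed */

/* Create a source; the creator holds the first reference. */
struct toy_source *toy_source_new(toy_timer_cb cb, void *data)
{
    struct toy_source *s = malloc(sizeof *s);
    assert(s != NULL);
    s->refcount = 1;
    s->attached = false;
    s->cb = cb;
    s->data = data;
    toy_live_sources++;
    return s;
}

/* Drop one reference; free the source when none remain. */
void toy_source_unref(struct toy_source *s)
{
    assert(s != NULL && s->refcount > 0);
    if (--s->refcount == 0) {
        free(s);
        toy_live_sources--;
    }
}

/* Hand the source to the loop, which takes a reference of its own. */
void toy_source_attach(struct toy_source *s)
{
    assert(s != NULL && !s->attached);
    s->attached = true;
    s->refcount++;
}

/* Fire the timer once; a false return detaches it and drops the loop's ref. */
void toy_source_dispatch(struct toy_source *s)
{
    assert(s != NULL);
    if (s->attached && !s->cb(s->data)) {
        s->attached = false;
        toy_source_unref(s);
    }
}

/* One-shot callback: always asks to be removed, like returning FALSE. */
bool toy_one_shot(void *data)
{
    (void)data;
    return false;
}
```

In this model, after one dispatch of toy_one_shot the source is no longer
attached, but toy_live_sources stays at 1 until the creator calls
toy_source_unref(). Whether the real code behaves this way would need to
be checked against the sources.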
>>>
>>>> diff -r 106ca984041b heartbeat/hb_rexmit.c
>>>> --- a/heartbeat/hb_rexmit.c    Thu Apr 26 19:28:26 2012 +0900
>>>> +++ b/heartbeat/hb_rexmit.c    Thu Apr 26 19:31:44 2012 +0900
>>>> @@ -164,6 +164,8 @@
>>>>        seqno_t seq = (seqno_t) ri->seq;
>>>>        struct node_info* node = ri->node;
>>>>        struct ha_msg*    hmsg;
>>>> +    unsigned long           sourceid;
>>>> +    gpointer value;
>>>>   
>>>>        if (STRNCMP_CONST(node->status, UPSTATUS) != 0&&
>>>>            STRNCMP_CONST(node->status, ACTIVESTATUS) !=0) {
>>>> @@ -196,11 +198,17 @@
>>>>       
>>>>        node->track.last_rexmit_req = time_longclock();   
>>>>       
>>>> -    if (!g_hash_table_remove(rexmit_hash_table, ri)){
>>>> -        cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
>>>> -               "for seq/node(%ld %s)",                
>>>> -               __FUNCTION__, ri->seq, ri->node->nodename);
>>>> -        return FALSE;
>>>> +    value = g_hash_table_lookup(rexmit_hash_table, ri);
>>>> +    if ( value != NULL) {
>>>> +        sourceid = (unsigned long) value;
>>>> +        Gmain_timeout_remove(sourceid);
>>>> +
>>>> +        if (!g_hash_table_remove(rexmit_hash_table, ri)){
>>>> +            cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
>>>> +                       "for seq/node(%ld %s)",                
>>>> +                       __FUNCTION__, ri->seq, ri->node->nodename);
>>>> +            return FALSE;
>>>> +        }
>>>>        }
>>>>       
>>>>        schedule_rexmit_request(node, seq, max_rexmit_delay);
>>>
>>> -- 
>>> : Lars Ellenberg
>>> : LINBIT | Your Way to High Availability
>>> : DRBD/HA support and consulting http://www.linbit.com
>>>
>>> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>>> _______________________________________________________
>>> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>> Home Page: http://linux-ha.org/
>>>


-- 
     Alan Robertson<al...@unix.sh>  - @OSSAlanR

"Openness is the foundation and preservative of friendship...  Let me claim 
from you at all times your undisguised opinions." - William Wilberforce