Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-22 Thread Andrew Beekhof
And you'll also want this patch for the crmd

diff -r 4619c842d58c crmd/callbacks.c
--- a/crmd/callbacks.c  Fri May 22 16:52:14 2009 +0200
+++ b/crmd/callbacks.c  Fri May 22 21:34:12 2009 +0200
@@ -179,7 +179,6 @@ crmd_ha_msg_callback(HA_Message *hamsg,

} else {
crmd_ha_msg_filter(msg);
-   return;
}

   bail:
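
A note on the hunk above: it drops an early `return` that sat just before the `bail:` label. One plausible reading, and it is only an assumption since the rest of the function is not shown, is that cleanup happens at `bail:` and the early return skipped it. The sketch below illustrates that single-exit pattern; `msg_t`, `handle_message` and the counter are hypothetical names, not crmd code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from crmd. */
typedef struct { int needs_filter; } msg_t;

static long live_msgs = 0;   /* messages allocated but not yet freed */

static msg_t *msg_new(int needs_filter) {
    msg_t *m = malloc(sizeof(*m));
    if (m != NULL) {
        m->needs_filter = needs_filter;
        live_msgs++;
    }
    return m;
}

static void msg_free(msg_t *m) {
    if (m == NULL) {
        return;
    }
    assert(live_msgs > 0);
    free(m);
    live_msgs--;
}

/* Every branch falls through to the one cleanup label. */
void handle_message(int needs_filter) {
    msg_t *m = msg_new(needs_filter);

    if (m == NULL) {
        goto bail;
    }
    if (!m->needs_filter) {
        /* handled directly */
    } else {
        /* filtered; an early "return;" here would skip msg_free() below */
    }

bail:
    msg_free(m);
}

long messages_outstanding(void) {
    return live_msgs;
}
```

With the early return in place, every message taking the `else` branch would leave the counter one higher.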


On Wed, May 20, 2009 at 2:47 PM, Nikola Ciprich  wrote:
> On Wed, May 20, 2009 at 02:02:52PM +0200, Andrew Beekhof wrote:
>> Ah, well that was pretty obvious.
>> /me humbly apologizes for such a stupid error.
> Hi and thanks! no problem
>
>
>> (It wasn't caught by my own valgrind testing because this function is
>> specific to heartbeat based clusters)
> don't worry, I'm doing a lots of testing for you ;)
> I've already compiled it an deployed on testing machines,
> memory usage seems to be pretty low. I'll report
> few days later if everything is OK.
> thanks a lot once more!
> nik
>
>>
>>
>> Try this:
>>
>> diff -r ea5d0b58c0be cib/callbacks.c
>> --- a/cib/callbacks.c Wed May 20 11:56:39 2009 +0200
>> +++ b/cib/callbacks.c Wed May 20 14:01:30 2009 +0200
>> @@ -1064,6 +1064,7 @@ cib_ha_peer_callback(HA_Message * msg, v
>>  {
>>      xmlNode *xml = convert_ha_message(NULL, msg, __FUNCTION__);
>>      cib_peer_callback(xml, private_data);
>> +    free_xml(xml);
>>  }
>>
>>  void
>>
>>
>>
>>
>> On Tue, May 19, 2009 at 8:24 PM, Andrew Beekhof  wrote:
>> > I'll take a look at the valgrind data.  Thanks!
>> >
>> > On Tue, May 19, 2009 at 6:39 PM, Nikola Ciprich  
>> > wrote:
>> >> Hello,
>> >> sorry to bother again. I've discovered why valgrind didn't
>> >> find anything. It is important to stop the process in order to
>> >> have valgrind finish the analysis. And it seems that there
>> >> really are leaks not only in cib, but also in attrd and crmd.
>> >> I just had a slight look into the code reported by valgrind
>> >> as problematic and though I would certainly need to examine
>> >> it much more to understand it properly, I think there are
>> >> leaks. I'm attaching the valgrind reports, In case You would be
>> >> interested in examining them.
>> >> If I could provide any help, I'll be more than happy.
>> >> (well, I guess I could of course help by sending patches :) but I'm
>> >> afraid this will take me a lot of time, I can try though).
>> >> with best regards
>> >> nik
>> >>
>> >>> Not really. Sorry :(
>> >>>
>> >>
>> >> --
>> >> -
>> >> Nikola CIPRICH
>> >> LinuxBox.cz, s.r.o.
>> >> 28. rijna 168, 709 01 Ostrava
>> >>
>> >> tel.:   +420 596 603 142
>> >> fax:    +420 596 621 273
>> >> mobil:  +420 777 093 799
>> >>
>> >> www.linuxbox.cz
>> >>
>> >> mobil servis: +420 737 238 656
>> >> email servis: ser...@linuxbox.cz
>> >> -
>> >>
>> >
>>
>

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-20 Thread Nikola Ciprich
On Wed, May 20, 2009 at 02:02:52PM +0200, Andrew Beekhof wrote:
> Ah, well that was pretty obvious.
> /me humbly apologizes for such a stupid error.
Hi and thanks! no problem


> (It wasn't caught by my own valgrind testing because this function is
> specific to heartbeat based clusters)
don't worry, I'm doing a lot of testing for you ;)
I've already compiled it and deployed it on testing machines;
memory usage seems to be pretty low. I'll report back
in a few days if everything is OK.
thanks a lot once more!
nik

> 
> 
> Try this:
> 
> diff -r ea5d0b58c0be cib/callbacks.c
> --- a/cib/callbacks.c Wed May 20 11:56:39 2009 +0200
> +++ b/cib/callbacks.c Wed May 20 14:01:30 2009 +0200
> @@ -1064,6 +1064,7 @@ cib_ha_peer_callback(HA_Message * msg, v
>  {
>  xmlNode *xml = convert_ha_message(NULL, msg, __FUNCTION__);
>  cib_peer_callback(xml, private_data);
> +free_xml(xml);
>  }
> 
>  void
> 
> 
> 
> 
> On Tue, May 19, 2009 at 8:24 PM, Andrew Beekhof  wrote:
> > I'll take a look at the valgrind data.  Thanks!
> >
> > On Tue, May 19, 2009 at 6:39 PM, Nikola Ciprich  
> > wrote:
> >> Hello,
> >> sorry to bother again. I've discovered why valgrind didn't
> >> find anything. It is important to stop the process in order to
> >> have valgrind finish the analysis. And it seems that there
> >> really are leaks not only in cib, but also in attrd and crmd.
> >> I just had a slight look into the code reported by valgrind
> >> as problematic and though I would certainly need to examine
> >> it much more to understand it properly, I think there are
> >> leaks. I'm attaching the valgrind reports, In case You would be
> >> interested in examining them.
> >> If I could provide any help, I'll be more than happy.
> >> (well, I guess I could of course help by sending patches :) but I'm
> >> afraid this will take me a lot of time, I can try though).
> >> with best regards
> >> nik
> >>
> >>> Not really. Sorry :(
> >>>
> >>
> >> --
> >> -
> >> Nikola CIPRICH
> >> LinuxBox.cz, s.r.o.
> >> 28. rijna 168, 709 01 Ostrava
> >>
> >> tel.:   +420 596 603 142
> >> fax:    +420 596 621 273
> >> mobil:  +420 777 093 799
> >>
> >> www.linuxbox.cz
> >>
> >> mobil servis: +420 737 238 656
> >> email servis: ser...@linuxbox.cz
> >> -
> >>
> >
> 



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-20 Thread Andrew Beekhof
Ah, well that was pretty obvious.
/me humbly apologizes for such a stupid error.

(It wasn't caught by my own valgrind testing because this function is
specific to heartbeat based clusters)


Try this:

diff -r ea5d0b58c0be cib/callbacks.c
--- a/cib/callbacks.c   Wed May 20 11:56:39 2009 +0200
+++ b/cib/callbacks.c   Wed May 20 14:01:30 2009 +0200
@@ -1064,6 +1064,7 @@ cib_ha_peer_callback(HA_Message * msg, v
 {
 xmlNode *xml = convert_ha_message(NULL, msg, __FUNCTION__);
 cib_peer_callback(xml, private_data);
+free_xml(xml);
 }

 void
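
The one-line fix above comes down to ownership: the conversion allocates a new XML tree on every call, the peer callback only uses it, so the caller has to free it. Here is a minimal sketch of that shape, assuming hypothetical stand-in types (`wire_msg_t`, `tree_t`) and helpers rather than the real `HA_Message`, `xmlNode`, `convert_ha_message` or `free_xml`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the message and XML tree types. */
typedef struct { const char *payload; } wire_msg_t;
typedef struct { char *text; } tree_t;

static long live_trees = 0;  /* trees converted but not yet freed */

/* Allocates a new tree on every call; the caller owns the result. */
static tree_t *convert_message(const wire_msg_t *msg) {
    tree_t *t = malloc(sizeof(*t));
    if (t == NULL) {
        return NULL;
    }
    size_t n = strlen(msg->payload) + 1;
    t->text = malloc(n);
    if (t->text == NULL) {
        free(t);
        return NULL;
    }
    memcpy(t->text, msg->payload, n);
    live_trees++;
    return t;
}

static void free_tree(tree_t *t) {
    if (t == NULL) {
        return;
    }
    assert(live_trees > 0);
    free(t->text);
    free(t);
    live_trees--;
}

/* Borrows the tree: reads it, never frees it. */
static size_t process_tree(const tree_t *t) {
    return t != NULL ? strlen(t->text) : 0;
}

/* Same shape as the patched callback: convert, use, free. */
size_t peer_callback(const wire_msg_t *msg) {
    tree_t *t = convert_message(msg);
    size_t n = process_tree(t);
    free_tree(t);   /* without this call, one tree leaks per message */
    return n;
}

long trees_outstanding(void) {
    return live_trees;
}
```

A leak of this kind grows with message traffic, which would fit steady RSS growth on a busy cluster.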




On Tue, May 19, 2009 at 8:24 PM, Andrew Beekhof  wrote:
> I'll take a look at the valgrind data.  Thanks!
>
> On Tue, May 19, 2009 at 6:39 PM, Nikola Ciprich  
> wrote:
>> Hello,
>> sorry to bother again. I've discovered why valgrind didn't
>> find anything. It is important to stop the process in order to
>> have valgrind finish the analysis. And it seems that there
>> really are leaks not only in cib, but also in attrd and crmd.
>> I just had a slight look into the code reported by valgrind
>> as problematic and though I would certainly need to examine
>> it much more to understand it properly, I think there are
>> leaks. I'm attaching the valgrind reports, In case You would be
>> interested in examining them.
>> If I could provide any help, I'll be more than happy.
>> (well, I guess I could of course help by sending patches :) but I'm
>> afraid this will take me a lot of time, I can try though).
>> with best regards
>> nik
>>
>>> Not really. Sorry :(
>>>
>>
>> --
>> -
>> Nikola CIPRICH
>> LinuxBox.cz, s.r.o.
>> 28. rijna 168, 709 01 Ostrava
>>
>> tel.:   +420 596 603 142
>> fax:    +420 596 621 273
>> mobil:  +420 777 093 799
>>
>> www.linuxbox.cz
>>
>> mobil servis: +420 737 238 656
>> email servis: ser...@linuxbox.cz
>> -
>>
>



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-19 Thread Andrew Beekhof
I'll take a look at the valgrind data.  Thanks!

On Tue, May 19, 2009 at 6:39 PM, Nikola Ciprich  wrote:
> Hello,
> sorry to bother again. I've discovered why valgrind didn't
> find anything. It is important to stop the process in order to
> have valgrind finish the analysis. And it seems that there
> really are leaks not only in cib, but also in attrd and crmd.
> I just had a slight look into the code reported by valgrind
> as problematic and though I would certainly need to examine
> it much more to understand it properly, I think there are
> leaks. I'm attaching the valgrind reports, In case You would be
> interested in examining them.
> If I could provide any help, I'll be more than happy.
> (well, I guess I could of course help by sending patches :) but I'm
> afraid this will take me a lot of time, I can try though).
> with best regards
> nik
>
>> Not really. Sorry :(
>>
>
> --
> -
> Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28. rijna 168, 709 01 Ostrava
>
> tel.:   +420 596 603 142
> fax:    +420 596 621 273
> mobil:  +420 777 093 799
>
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
> -
>



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-19 Thread Nikola Ciprich
Hello,
sorry to bother you again. I've discovered why valgrind didn't
find anything: the process has to be stopped before valgrind
can finish its analysis. And it seems that there
really are leaks, not only in cib but also in attrd and crmd.
I've only had a quick look at the code that valgrind reports
as problematic, and although I would certainly need to examine
it much more to understand it properly, I think there are
leaks. I'm attaching the valgrind reports in case You would be
interested in examining them.
If I can help in any way, I'll be more than happy to.
(Well, I guess I could of course help by sending patches :) but I'm
afraid that would take me a lot of time. I can try, though.)
with best regards
nik
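
As background to the point about stopping the process: Memcheck prints its leak summary when the program it is running terminates, so a daemon that never exits never produces one. The following is a minimal sketch of a loop that winds down on SIGTERM so the process can exit normally; the function names are hypothetical and this is not how the cib itself handles shutdown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t stop_requested = 0;

static void on_term(int signo) {
    (void)signo;
    stop_requested = 1;   /* only set a flag; the main loop does the rest */
}

/* Installs the SIGTERM handler; returns 0 on success, -1 on failure. */
int install_stop_handler(void) {
    struct sigaction sa;
    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGTERM, &sa, NULL);
}

/* Calls work() until SIGTERM has been seen or max_iterations is reached.
 * Returns the number of iterations actually run. */
long run_until_stopped(void (*work)(void), long max_iterations) {
    long done = 0;

    assert(max_iterations >= 0);
    while (!stop_requested && done < max_iterations) {
        if (work != NULL) {
            work();
        }
        done++;
    }
    return done;
}
```

Once the loop returns, the caller can free what it owns and return from `main`, at which point the leak report is written.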

> Not really. Sorry :(
> 

-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


valgrind.tar.gz
Description: GNU Zip compressed data


Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-18 Thread Andrew Beekhof
On Sat, May 16, 2009 at 10:33 PM, Nikola Ciprich
 wrote:
> Hi guys,
> I was able to enable valgrind on our production cluster today,
> but unfortunately only on the secondary node, I'll be allowed to enable
> it on primary node hopefully during next weekend.
> Unfortunately it seems that valgrind probably won't be of much help here.
> I've got some output from it, but it's only few warnings and it seems
> that growing memory consumption is not really caused by leak, but (maybe)
> only by some growing memory structure. I'm doing one not very nice thing
> in my cluster which might be the culprit:
> I'm monitoring some service by a cron script and periodically changing
> related resource score by the following command:
>
> cibadmin -U -o constraints -X "
>        
>            role="Master">
>               operation="eq" value="${host}"/>
>           
>        
> Is it possible that this could be causing cib growing memory consumption?

Anything is possible, but it would be unlikely.
There's nothing special about that command that would make only it leak.

> Anyways, I'm attaching valgrind output for cib process:
>
> ==14779== My PID = 14779, parent PID = 14766.  Prog and args are:
> ==14779==    /usr/lib64/heartbeat/cib
> ==14779==

> Can this help?

Not really. Sorry :(



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-16 Thread Nikola Ciprich
Hi guys,
I was able to enable valgrind on our production cluster today,
but unfortunately only on the secondary node. I'll hopefully be allowed to enable
it on the primary node next weekend.
Unfortunately it seems that valgrind probably won't be of much help here.
I've got some output from it, but it's only a few warnings, and it seems
that the growing memory consumption is not really caused by a leak but (maybe)
only by some growing memory structure. I'm doing one not very nice thing
in my cluster which might be the culprit:
I'm monitoring a service with a cron script and periodically changing
the related resource score with the following command:

cibadmin -U -o constraints -X "

   
  
   

Is it possible that this could be causing the cib's growing memory consumption?
Anyway, I'm attaching the valgrind output for the cib process:

==14779== My PID = 14779, parent PID = 14766.  Prog and args are:
==14779==    /usr/lib64/heartbeat/cib
==14779==
==14779== Conditional jump or move depends on uninitialised value(s)
==14779==    at 0x674E354: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CDA5: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CD5E: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CD5E: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674C77D: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x6751853: xmlXPathEvalExpression (in /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x4E3CB58: xpath_search (xml.c:2545)
==14779==    by 0x50567BE: cib_process_xpath (cib_ops.c:880)
==14779==    by 0x5053CB3: cib_process_query (cib_ops.c:49)
==14779==    by 0x5057F3E: cib_perform_op (cib_utils.c:539)
==14779==    by 0x40AFBD: cib_process_command (callbacks.c:843)
==14779==    by 0x40A3FC: cib_process_request (callbacks.c:660)
==14779==    by 0x408E7E: cib_common_callback_worker (callbacks.c:259)
==14779==    by 0x4090EE: cib_common_callback (callbacks.c:315)
==14779==    by 0x408C4C: cib_rw_callback (callbacks.c:206)
==14779==    by 0x5E69858: G_CH_dispatch_int (GSource.c:624)
==14779==    by 0x739FDB3: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2C0C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2F19: g_main_loop_run (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x40D3F0: cib_init (main.c:508)
==14779==    by 0x40C8AE: main (main.c:217)
==14779== Conditional jump or move depends on uninitialised value(s)
==14779==    at 0x674E354: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CDA5: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674C77D: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x6751853: xmlXPathEvalExpression (in /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x4E3CB58: xpath_search (xml.c:2545)
==14779==    by 0x50567BE: cib_process_xpath (cib_ops.c:880)
==14779==    by 0x5053CB3: cib_process_query (cib_ops.c:49)
==14779==    by 0x5057F3E: cib_perform_op (cib_utils.c:539)
==14779==    by 0x40AFBD: cib_process_command (callbacks.c:843)
==14779==    by 0x40A3FC: cib_process_request (callbacks.c:660)
==14779==    by 0x408E7E: cib_common_callback_worker (callbacks.c:259)
==14779==    by 0x4090EE: cib_common_callback (callbacks.c:315)
==14779==    by 0x408C4C: cib_rw_callback (callbacks.c:206)
==14779==    by 0x5E69858: G_CH_dispatch_int (GSource.c:624)
==14779==    by 0x739FDB3: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2C0C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2F19: g_main_loop_run (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x40D3F0: cib_init (main.c:508)
==14779==    by 0x40C8AE: main (main.c:217)
==14779== Syscall param unlink(pathname) points to uninitialised byte(s)
==14779==    at 0x6ACCC27: unlink (in /lib64/libc-2.5.so)
==14779==    by 0x5E6BBC5: socket_destroy_channel (ipcsocket.c:870)
==14779==    by 0x5E6780A: G_CH_destroy_int (GSource.c:677)
==14779==    by 0x739F74C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x739FEB9: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2C0C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2F19: g_main_loop_run (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x40D3F0: cib_init (main.c:508)
==14779==    by 0x40C8AE: main (main.c:217)
==14779==  Address 0x4092A72 is 2 bytes inside a block of size 110 alloc'd
==14779==    at 0x4C20809: malloc (vg_replace_malloc.c:149)
==14779==    by 0x73A6BFA: g_malloc (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x5E6B7AE: socket_accept_connection (ipcsocket.c:708)
==14779==    by 0x5E69364: G_WC_dispatch (GSource.c:830)
==14779==    by 0x739FDB3: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73

Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-14 Thread Nikola Ciprich
Hi guys,
so I've got valgrind grinding :)
I had some trouble getting the latest stuff working, so I used heartbeat-2.99.2
with Dejan's (fixed) patch and
--enable-valgrind --with-valgrind-log="--log-file=/tmp/crm-%p.valgrind",
and recompiled pacemaker-1.0.3 (without openais, as Andrew suggested).
Now enabling valgrind works!
Unfortunately I don't see the leaks on my testing machine, so I'll have to try
it directly on the production one. Hopefully I'll have some time to play with it
tomorrow or during the weekend, so I'll report ASAP.
Thanks a lot for all Your help!
best regards
nik

On Thu, May 14, 2009 at 04:12:52PM +0200, Andrew Beekhof wrote:
> On Thu, May 14, 2009 at 3:58 PM, Nikola Ciprich  
> wrote:
> > Hi,
> > Dejan, thanks a lot, I compiled Your version, but crmd with shipped 
> > pacemaker keeps segfaulting
> > with it, and unable to rebuild pacemaker with this heartbeat to get the 
> > -debug package.
> > compilation fails with:
> >
> > plugin.c: In function 'check_message_sanity':
> > plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has 
> > type 'long unsigned int'
> > plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has 
> > type 'long unsigned int'
> > gmake[2]: *** [plugin.lo] Error 1
> > gmake[2]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib/ais'
> > gmake[1]: *** [all-recursive] Error 1
> > gmake[1]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib'
> > make: *** [all-recursive] Error 1
> > error: Bad exit status from /var/tmp/rpm-tmp.81431 (%build)
> >
> > Could You please send me only the related patch, so I could try compiling 
> > latest stable
> > version? I don't see it in the mercurial...
> 
> When you configure pacemaker, just add the --without-ais option.
> 
> >
> > Andrew thanks for Your patches as well, I'll try them, but honestly I'm a 
> > bit confused,
> > first patch is for heartbeat, right?
> 
> actually, you probably dont need the second one.  i think its in 1.0 already.
> 

-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-14 Thread Andrew Beekhof
On Thu, May 14, 2009 at 3:58 PM, Nikola Ciprich  wrote:
> Hi,
> Dejan, thanks a lot, I compiled Your version, but crmd with shipped pacemaker 
> keeps segfaulting
> with it, and unable to rebuild pacemaker with this heartbeat to get the 
> -debug package.
> compilation fails with:
>
> plugin.c: In function 'check_message_sanity':
> plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has 
> type 'long unsigned int'
> plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has 
> type 'long unsigned int'
> gmake[2]: *** [plugin.lo] Error 1
> gmake[2]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib/ais'
> gmake[1]: *** [all-recursive] Error 1
> gmake[1]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib'
> make: *** [all-recursive] Error 1
> error: Bad exit status from /var/tmp/rpm-tmp.81431 (%build)
>
> Could You please send me only the related patch, so I could try compiling 
> latest stable
> version? I don't see it in the mercurial...

When you configure pacemaker, just add the --without-ais option.

>
> Andrew thanks for Your patches as well, I'll try them, but honestly I'm a bit 
> confused,
> first patch is for heartbeat, right?

Actually, you probably don't need the second one.  I think it's in 1.0 already.



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-14 Thread Nikola Ciprich
Hi,
Dejan, thanks a lot. I compiled Your version, but the crmd from the shipped
pacemaker keeps segfaulting with it, and I'm unable to rebuild pacemaker
against this heartbeat to get the -debug package.
Compilation fails with:

plugin.c: In function 'check_message_sanity':
plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has type 'long unsigned int'
plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has type 'long unsigned int'
gmake[2]: *** [plugin.lo] Error 1
gmake[2]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib/ais'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib'
make: *** [all-recursive] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.81431 (%build)
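
For reference, the warning in the log above is about a printf-style conversion that does not match its argument: `%d` expects an `int`, while the argument is an `unsigned long`. Whether the build treats this warning as an error is an assumption here; the log only shows the warning followed by `Error 1`. A minimal sketch of the matching conversion, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Formats a size into buf with a conversion that matches the argument type.
 * Returns what snprintf returns: the length the full string needs. */
int format_size(char *buf, size_t buflen, unsigned long size) {
    assert(buf != NULL && buflen > 0);
    /* "%d" with an unsigned long argument is the mismatch gcc warns about;
     * "%lu" is the conversion for unsigned long. */
    return snprintf(buf, buflen, "size=%lu", size);
}
```

On 64-bit Linux `int` and `unsigned long` differ in width, so the mismatch is more than cosmetic there.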

Could You please send me just the relevant patch, so I can try compiling the
latest stable version? I don't see it in mercurial...

Andrew, thanks for Your patches as well. I'll try them, but honestly I'm a bit
confused: the first patch is for heartbeat, right? And the second one for
pacemaker? It doesn't seem to apply to either -tip or 1.0.3...
BR
nik


-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-13 Thread Andrew Beekhof
On Wed, May 13, 2009 at 7:41 PM, Dejan Muhamedagic  wrote:
> Hi,
>
> On Wed, May 13, 2009 at 05:36:40PM +0200, Nikola Ciprich wrote:
>> > holy !
>> yes! exactly! :)
>>
>> > sure
>> > in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf
>>
>> hmm, i tried that now, but all I got is:
>> May 13 16:46:16 faxb heartbeat: [1655]: ERROR: Heartbeat was not compiled 
>> with --enable-libc-malloc, "crm valgrind" is therefor not supported.
>>
>> So I wanted to compile myself, but I see this option neither in
>> pacemaker's configure, nor in heartbeat's.  But I noticed
>> --enable-valgrind option for heartbeat configure,
>> but enabling it and recompiling the heartbeat didn't help.  so
>> maybe this part needs some updating?
>
> Looks like it. Just pushed a patch for that. Can you try again
> with the new tarball:
>
> http://hg.linux-ha.org/dev/archive/6467be4d4cb7.tar.bz2

Thanks Dejan!

Nikola, I also suggest the following two patches

diff -r 4038c4644964 configure.in
--- a/configure.in  Wed May 13 17:07:22 2009 +0200
+++ b/configure.in  Wed May 13 20:48:05 2009 +0200
@@ -2799,17 +2799,14 @@ AC_ARG_WITH(valgrind-suppress,
 [ VALGRIND_SUPP="/dev/null" ])

 if test "x" = "x$VALGRIND_LOG"; then
-VALGRIND_LOG="--log-socket=127.0.0.1:1234"
-AC_MSG_NOTICE(Set default Valgrind options to: $VALGRIND_OPTS)
-AC_MSG_NOTICE(Remember to start a receiver on localhost:1234)
+VALGRIND_LOG="--log-file=/tmp/crm-%p.valgrind"
 fi

-AC_PATH_PROG(VALGRIND_BIN, valgrind)
 if test "xyes" = "x$enable_valgrind" -a "x$VALGRIND_BIN" != "x"; then
enable_libc_malloc=yes
 fi

-AC_DEFINE_UNQUOTED(VALGRIND_BIN, "$VALGRIND_BIN", Valgrind command)
+AC_DEFINE_UNQUOTED(VALGRIND_BIN, "valgrind", Valgrind command)
 AC_DEFINE_UNQUOTED(VALGRIND_LOG, "$VALGRIND_LOG", Valgrind logging options)
 AC_DEFINE_UNQUOTED(VALGRIND_SUPP, "$VALGRIND_SUPP", Name of a suppression file to pass to Valgrind)

diff -r 4038c4644964 crm/crmd/subsystems.c
--- a/crm/crmd/subsystems.c Wed May 13 17:07:22 2009 +0200
+++ b/crm/crmd/subsystems.c Wed May 13 20:48:05 2009 +0200
@@ -148,6 +148,7 @@ start_subsystem(struct crm_subsystem_s* 
unsigned int    j;
struct rlimit   oflimits;
const char  *devnull = "/dev/null";
+   const char  *grind = getenv("HA_VALGRIND_ENABLED");

crm_info("Starting sub-system \"%s\"", the_subsystem->name);
set_bit_inplace(fsa_input_register, the_subsystem->flag_required);
@@ -211,7 +212,8 @@ start_subsystem(struct crm_subsystem_s* 
(void)open(devnull, O_WRONLY);  /* Stdout: fd 1 */
(void)open(devnull, O_WRONLY);  /* Stderr: fd 2 */

-   if(getenv("HA_VALGRIND_ENABLED") != NULL) {
+   if(grind != NULL
+  && (crm_is_true(grind) || strstr(grind, the_subsystem->name))) {
char *opts[] = { crm_strdup(VALGRIND_BIN),
 crm_strdup("--show-reachable=yes"),
 crm_strdup("--leak-check=full"),
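
The condition added in the second patch can be read as: run a subsystem under valgrind if `HA_VALGRIND_ENABLED` is a "true" value, or if it contains that subsystem's name. The sketch below is one way to express that decision in isolation. `looks_true` is a hypothetical stand-in for `crm_is_true`, and the spellings it accepts are an assumption, as is the comma-separated value used in the usage note.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-in for crm_is_true(); accepted spellings are assumed. */
static bool looks_true(const char *s) {
    static const char *const yes[] = { "true", "yes", "on", "1" };

    for (size_t i = 0; i < sizeof(yes) / sizeof(yes[0]); i++) {
        if (strcasecmp(s, yes[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* grind is the value of HA_VALGRIND_ENABLED, or NULL when it is unset. */
bool run_under_valgrind(const char *grind, const char *subsystem) {
    assert(subsystem != NULL);
    if (grind == NULL) {
        return false;
    }
    /* strstr() is a plain substring match, as in the patch. */
    return looks_true(grind) || strstr(grind, subsystem) != NULL;
}
```

With a value such as `cib,attrd`, only those two subsystems would be started under valgrind. Because the match is a substring test, a name contained in another name would also match.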



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-13 Thread Dejan Muhamedagic
Hi,

On Wed, May 13, 2009 at 05:36:40PM +0200, Nikola Ciprich wrote:
> > holy !
> yes! exactly! :)
> 
> > sure
> > in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf
> 
> hmm, i tried that now, but all I got is:
> May 13 16:46:16 faxb heartbeat: [1655]: ERROR: Heartbeat was not compiled 
> with --enable-libc-malloc, "crm valgrind" is therefor not supported.
> 
> So I wanted to compile myself, but I see this option neither in
> pacemaker's configure, nor in heartbeat's.  But I noticed
> --enable-valgrind option for heartbeat configure,
> but enabling it and recompiling the heartbeat didn't help.  so
> maybe this part needs some updating?

Looks like it. Just pushed a patch for that. Can you try again
with the new tarball:

http://hg.linux-ha.org/dev/archive/6467be4d4cb7.tar.bz2

Thanks,

Dejan

> BR
> nik
> 
> 
> >
> > did this not work?
> >
> > ___
> > Pacemaker mailing list
> > Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> 
> -- 
> -
> Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28. rijna 168, 709 01 Ostrava
> 
> tel.:   +420 596 603 142
> fax:+420 596 621 273
> mobil:  +420 777 093 799
> www.linuxbox.cz
> 
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
> -
> 
> ___
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-13 Thread Nikola Ciprich
> holy !
yes! exactly! :)

> sure
> in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf

hmm, i tried that now, but all I got is:
May 13 16:46:16 faxb heartbeat: [1655]: ERROR: Heartbeat was not compiled with --enable-libc-malloc, "crm valgrind" is therefor not supported.

So I wanted to compile it myself, but I see this option neither in pacemaker's
configure nor in heartbeat's.
I did notice the --enable-valgrind option in heartbeat's configure, but enabling it
and recompiling heartbeat didn't help.
So maybe this part needs some updating?
BR
nik


>
> did this not work?
>
> ___
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>

-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-



Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-13 Thread Andrew Beekhof


On May 13, 2009, at 8:28 AM, Nikola Ciprich wrote:

> Hello,
> I've reported this some time ago, few days ago I've updated my
> system to pacemaker-1.0.3 + related packages.
> But unfortunately cib process seems to be still leaking,ie it's RSS
> memory usage is constantly growing.
> This means we have to restart whole heartbeat service approximately
> once every two weeks as the memory usage of cib process gets to
> ~1.5GB.

holy !

> Some time ago when I was trying to use valgrind, I had some trouble,
> Andrew, You wrote that You're mostly testing openais variant, and
> it's possible that heartbeat has some problems being started with
> valgrind. could You please help me with running the cib process with
> valgrind so I could provide more accurate repport?

sure
in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf

did this not work?



[Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-12 Thread Nikola Ciprich
Hello,
I reported this some time ago. A few days ago I updated my system to
pacemaker-1.0.3 plus related packages.
Unfortunately the cib process still seems to be leaking, i.e. its RSS memory
usage is constantly growing.
This means we have to restart the whole heartbeat service approximately once every
two weeks, as the memory usage of the cib process gets to ~1.5GB.
Some time ago, when I was trying to use valgrind, I had some trouble. Andrew,
You wrote that You're mostly testing the openais variant and that heartbeat
may have problems being started under valgrind. Could You please help
me run the cib process under valgrind so I can provide a more accurate
report?
Thank You very much in advance.
best regards
nik

-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-
