Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
And you'll also want this patch for the crmd:

diff -r 4619c842d58c crmd/callbacks.c
--- a/crmd/callbacks.c Fri May 22 16:52:14 2009 +0200
+++ b/crmd/callbacks.c Fri May 22 21:34:12 2009 +0200
@@ -179,7 +179,6 @@ crmd_ha_msg_callback(HA_Message *hamsg,
     } else {
         crmd_ha_msg_filter(msg);
-        return;
     }
  bail:

On Wed, May 20, 2009 at 2:47 PM, Nikola Ciprich wrote:
> On Wed, May 20, 2009 at 02:02:52PM +0200, Andrew Beekhof wrote:
>> Ah, well that was pretty obvious.
>> /me humbly apologizes for such a stupid error.
> Hi and thanks! no problem
>
>> (It wasn't caught by my own valgrind testing because this function is
>> specific to heartbeat based clusters)
> don't worry, I'm doing a lots of testing for you ;)
> I've already compiled it an deployed on testing machines,
> memory usage seems to be pretty low. I'll report
> few days later if everything is OK.
> thanks a lot once more!
> nik

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
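For readers following the crmd hunk: it removes a `return;` so that control falls through to the `bail:` label. The hunk does not show what follows that label, so the sketch below only assumes that cleanup code lives there. All names (`grab`, `release`, `outstanding`, the two callbacks) are hypothetical and are not taken from the Pacemaker sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter of buffers that were allocated but not yet released. */
static int outstanding = 0;

static void *grab(void)
{
    void *p = malloc(16);
    if (p != NULL) {
        outstanding++;
    }
    return p;
}

static void release(void *p)
{
    if (p != NULL) {
        free(p);
        outstanding--;
    }
}

/* Early return in the else branch skips the cleanup after the label. */
static void callback_with_early_return(int drop)
{
    void *msg = grab();
    if (msg == NULL || drop) {
        goto bail;
    } else {
        /* ... process msg ... */
        return;              /* msg is never released on this path */
    }
bail:
    release(msg);
}

/* Same logic without the return: every path reaches the cleanup. */
static void callback_fall_through(int drop)
{
    void *msg = grab();
    if (msg == NULL || drop) {
        goto bail;
    } else {
        /* ... process msg ... */
    }
bail:
    release(msg);
}
```

Calling `callback_fall_through(0)` leaves `outstanding` at 0, while `callback_with_early_return(0)` leaves one buffer unreleased per call, which is the kind of slow growth a long-running daemon would show.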
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
On Wed, May 20, 2009 at 02:02:52PM +0200, Andrew Beekhof wrote:
> Ah, well that was pretty obvious.
> /me humbly apologizes for such a stupid error.

Hi and thanks! No problem.

> (It wasn't caught by my own valgrind testing because this function is
> specific to heartbeat based clusters)

Don't worry, I'm doing a lot of testing for you ;)
I've already compiled it and deployed it on the testing machines, and
memory usage seems to be pretty low. I'll report back in a few days
on whether everything is OK.
Thanks a lot once more!
nik
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
Ah, well that was pretty obvious.
/me humbly apologizes for such a stupid error.

(It wasn't caught by my own valgrind testing because this function is
specific to heartbeat based clusters)

Try this:

diff -r ea5d0b58c0be cib/callbacks.c
--- a/cib/callbacks.c Wed May 20 11:56:39 2009 +0200
+++ b/cib/callbacks.c Wed May 20 14:01:30 2009 +0200
@@ -1064,6 +1064,7 @@ cib_ha_peer_callback(HA_Message * msg, v
 {
     xmlNode *xml = convert_ha_message(NULL, msg, __FUNCTION__);
     cib_peer_callback(xml, private_data);
+    free_xml(xml);
 }
 void
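The one-line fix above is an ownership issue: the wrapper converts the incoming message into a newly allocated tree, hands it to a handler that only borrows it, and then has to free it itself. A minimal sketch of that pattern follows, with hypothetical stand-ins (`node_t`, `convert_message`, `handle_node`, `free_node`) in place of the real `xmlNode`, `convert_ha_message`, `cib_peer_callback` and `free_xml`; a counter makes the leak observable without valgrind.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an XML tree; live_nodes counts unreleased ones. */
typedef struct node {
    char *payload;
} node_t;

static int live_nodes = 0;

/* Returns a freshly allocated node that the caller owns. */
static node_t *convert_message(const char *msg)
{
    size_t len = strlen(msg) + 1;
    node_t *n = malloc(sizeof(*n));
    if (n == NULL) {
        return NULL;
    }
    n->payload = malloc(len);
    if (n->payload == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->payload, msg, len);
    live_nodes++;
    return n;
}

static void free_node(node_t *n)
{
    if (n == NULL) {
        return;
    }
    free(n->payload);
    free(n);
    live_nodes--;
}

/* The handler only borrows the node; it does not free it. */
static size_t handle_node(const node_t *n)
{
    return n != NULL ? strlen(n->payload) : 0;
}

/* Before the fix: the converted node is dropped on the floor. */
static size_t peer_callback_leaky(const char *msg)
{
    node_t *n = convert_message(msg);
    return handle_node(n);
}

/* After the fix: the wrapper releases what it allocated. */
static size_t peer_callback_fixed(const char *msg)
{
    node_t *n = convert_message(msg);
    size_t handled = handle_node(n);
    free_node(n);
    return handled;
}
```

Each call to `peer_callback_leaky` leaves one node behind, so a callback that runs for every peer message would grow the process steadily, which matches the symptom reported in this thread.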
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
I'll take a look at the valgrind data. Thanks!

On Tue, May 19, 2009 at 6:39 PM, Nikola Ciprich wrote:
> I'm attaching the valgrind reports, In case You would be
> interested in examining them.
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
Hello,
sorry to bother you again. I've discovered why valgrind didn't find anything: the process has to be stopped for valgrind to finish its analysis. And it seems that there really are leaks, not only in cib but also in attrd and crmd. I only had a brief look at the code valgrind reports as problematic, and though I would certainly need to examine it much more to understand it properly, I think there are leaks.

I'm attaching the valgrind reports, in case you are interested in examining them. If I can help in any way, I'll be more than happy to. (Well, I guess I could of course help by sending patches :) but I'm afraid that would take me a lot of time. I can try, though.)

with best regards
nik

> Not really. Sorry :(

--
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.: +420 596 603 142
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz

Attachment: valgrind.tar.gz (GNU Zip compressed data)
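On the point about stopping the process: valgrind's memcheck prints its leak summary when the program exits, so a daemon has to terminate normally before the report is complete. As a rough sketch (not how heartbeat or the cib actually handle shutdown), a long-running loop can exit cleanly on SIGTERM like this; `stop_requested`, `install_term_handler` and `run_until_stopped` are hypothetical names.

```c
#include <assert.h>
#include <signal.h>

/* Set from the signal handler, polled by the main loop. */
static volatile sig_atomic_t stop_requested = 0;

static void on_term(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_term_handler(void)
{
    return signal(SIGTERM, on_term) == SIG_ERR ? -1 : 0;
}

/*
 * Stand-in for a daemon's main loop: runs until a stop is requested or
 * max_iterations is reached, then returns so the process can exit
 * through the normal path.
 */
static long run_until_stopped(long max_iterations)
{
    long done = 0;
    while (!stop_requested && done < max_iterations) {
        done++;              /* ... one unit of work ... */
    }
    return done;
}
```

A process that is killed with SIGKILL never reaches its exit path, so no leak summary is written for it.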
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
On Sat, May 16, 2009 at 10:33 PM, Nikola Ciprich wrote:
> Hi guys,
> I was able to enable valgrind on our production cluster today,
> but unfortunately only on the secondary node, I'll be allowed to enable
> it on primary node hopefully during next weekend.
> Unfortunately it seems that valgrind probably won't be of much help here.
> I've got some output from it, but it's only few warnings and it seems
> that growing memory consumption is not really caused by leak, but (maybe)
> only by some growing memory structure. I'm doing one not very nice thing
> in my cluster which might be the culprit:
> I'm monitoring some service by a cron script and periodically changing
> related resource score by the following command:
>
> cibadmin -U -o constraints -X "
> role="Master">
> operation="eq" value="${host}"/>
>
> Is it possible that this could be causing cib growing memory consumption?

Anything is possible, but it would be unlikely.
There's nothing special about that command that would make only it leak.

> Anyways, I'm attaching valgrind output for cib process:
>
> ==14779== My PID = 14779, parent PID = 14766. Prog and args are:
> ==14779== /usr/lib64/heartbeat/cib
> ==14779==

> Can this help?

Not really. Sorry :(
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
Hi guys,
I was able to enable valgrind on our production cluster today, but unfortunately only on the secondary node; I'll hopefully be allowed to enable it on the primary node next weekend.

Unfortunately it seems that valgrind probably won't be of much help here. I've got some output from it, but it's only a few warnings, and it seems that the growing memory consumption is not really caused by a leak but (maybe) only by some growing memory structure. I'm doing one not very nice thing in my cluster which might be the culprit: I'm monitoring a service with a cron script and periodically changing the related resource score with the following command:

cibadmin -U -o constraints -X "

Is it possible that this could be causing the cib's growing memory consumption?

Anyway, I'm attaching the valgrind output for the cib process:

==14779== My PID = 14779, parent PID = 14766. Prog and args are:
==14779==    /usr/lib64/heartbeat/cib
==14779==
==14779== Conditional jump or move depends on uninitialised value(s)
==14779==    at 0x674E354: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CDA5: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CD5E: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CD5E: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674C77D: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x6751853: xmlXPathEvalExpression (in /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x4E3CB58: xpath_search (xml.c:2545)
==14779==    by 0x50567BE: cib_process_xpath (cib_ops.c:880)
==14779==    by 0x5053CB3: cib_process_query (cib_ops.c:49)
==14779==    by 0x5057F3E: cib_perform_op (cib_utils.c:539)
==14779==    by 0x40AFBD: cib_process_command (callbacks.c:843)
==14779==    by 0x40A3FC: cib_process_request (callbacks.c:660)
==14779==    by 0x408E7E: cib_common_callback_worker (callbacks.c:259)
==14779==    by 0x4090EE: cib_common_callback (callbacks.c:315)
==14779==    by 0x408C4C: cib_rw_callback (callbacks.c:206)
==14779==    by 0x5E69858: G_CH_dispatch_int (GSource.c:624)
==14779==    by 0x739FDB3: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2C0C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2F19: g_main_loop_run (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x40D3F0: cib_init (main.c:508)
==14779==    by 0x40C8AE: main (main.c:217)
==14779== Conditional jump or move depends on uninitialised value(s)
==14779==    at 0x674E354: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674CDA5: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x674C77D: (within /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x6751853: xmlXPathEvalExpression (in /usr/lib64/libxml2.so.2.6.26)
==14779==    by 0x4E3CB58: xpath_search (xml.c:2545)
==14779==    by 0x50567BE: cib_process_xpath (cib_ops.c:880)
==14779==    by 0x5053CB3: cib_process_query (cib_ops.c:49)
==14779==    by 0x5057F3E: cib_perform_op (cib_utils.c:539)
==14779==    by 0x40AFBD: cib_process_command (callbacks.c:843)
==14779==    by 0x40A3FC: cib_process_request (callbacks.c:660)
==14779==    by 0x408E7E: cib_common_callback_worker (callbacks.c:259)
==14779==    by 0x4090EE: cib_common_callback (callbacks.c:315)
==14779==    by 0x408C4C: cib_rw_callback (callbacks.c:206)
==14779==    by 0x5E69858: G_CH_dispatch_int (GSource.c:624)
==14779==    by 0x739FDB3: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2C0C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2F19: g_main_loop_run (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x40D3F0: cib_init (main.c:508)
==14779==    by 0x40C8AE: main (main.c:217)
==14779== Syscall param unlink(pathname) points to uninitialised byte(s)
==14779==    at 0x6ACCC27: unlink (in /lib64/libc-2.5.so)
==14779==    by 0x5E6BBC5: socket_destroy_channel (ipcsocket.c:870)
==14779==    by 0x5E6780A: G_CH_destroy_int (GSource.c:677)
==14779==    by 0x739F74C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x739FEB9: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2C0C: (within /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73A2F19: g_main_loop_run (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x40D3F0: cib_init (main.c:508)
==14779==    by 0x40C8AE: main (main.c:217)
==14779== Address 0x4092A72 is 2 bytes inside a block of size 110 alloc'd
==14779==    at 0x4C20809: malloc (vg_replace_malloc.c:149)
==14779==    by 0x73A6BFA: g_malloc (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x5E6B7AE: socket_accept_connection (ipcsocket.c:708)
==14779==    by 0x5E69364: G_WC_dispatch (GSource.c:830)
==14779==    by 0x739FDB3: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.1200.3)
==14779==    by 0x73
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
Hi guys,
sooo I've got valgrind grinding :)

I had some trouble getting the latest stuff working, so I used heartbeat-2.99.2 with Dejan's (fixed) patch, configured with

  --enable-valgrind --with-valgrind-log="--log-file=/tmp/crm-%p.valgrind"

and recompiled pacemaker-1.0.3 (without openais, as Andrew suggested). Now enabling valgrind works!

Unfortunately I don't see the leaks on my testing machine, so I'll have to try it directly on the production one. Hopefully I'll have some time to play with it tomorrow or during the weekend, so I'll report ASAP.

Thanks a lot for all your help!
best regards
nik
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
On Thu, May 14, 2009 at 3:58 PM, Nikola Ciprich wrote:
> Hi,
> Dejan, thanks a lot, I compiled Your version, but crmd with shipped pacemaker
> keeps segfaulting with it, and unable to rebuild pacemaker with this
> heartbeat to get the -debug package.
> compilation fails with:
>
> plugin.c: In function 'check_message_sanity':
> plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has type 'long unsigned int'
> plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has type 'long unsigned int'
> gmake[2]: *** [plugin.lo] Error 1
> gmake[2]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib/ais'
> gmake[1]: *** [all-recursive] Error 1
> gmake[1]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib'
> make: *** [all-recursive] Error 1
> error: Bad exit status from /var/tmp/rpm-tmp.81431 (%build)
>
> Could You please send me only the related patch, so I could try compiling
> latest stable version? I don't see it in the mercurial...

When you configure pacemaker, just add the --without-ais option.

> Andrew thanks for Your patches as well, I'll try them, but honestly I'm a bit
> confused, first patch is for heartbeat, right?

Actually, you probably don't need the second one. I think it's in 1.0 already.
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
Hi,
Dejan, thanks a lot. I compiled your version, but crmd from the shipped pacemaker keeps segfaulting with it, and I'm unable to rebuild pacemaker against this heartbeat to get the -debug package. Compilation fails with:

plugin.c: In function 'check_message_sanity':
plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has type 'long unsigned int'
plugin.c:1190: warning: format '%d' expects type 'int', but argument 10 has type 'long unsigned int'
gmake[2]: *** [plugin.lo] Error 1
gmake[2]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib/ais'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/home/src/redhat/BUILD/pacemaker/lib'
make: *** [all-recursive] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.81431 (%build)

Could you please send me only the related patch, so I can try compiling the latest stable version? I don't see it in mercurial...

Andrew, thanks for your patches as well. I'll try them, but honestly I'm a bit confused: the first patch is for heartbeat, right? And the second one for pacemaker? It doesn't seem to apply either to -tip or to 1.0.3...

BR
nik
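A note on the build log: gcc is complaining that a `%d` conversion was given a `long unsigned int` argument. The log shows only warnings before `Error 1`, so presumably the build treats warnings as errors; that is an assumption, the log does not say so. The usual fix for this kind of warning is to make the conversion match the argument type, as in this small sketch (`format_total` is a hypothetical helper, not code from plugin.c).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats an unsigned long with the matching %lu conversion.
 * Using %d here is what triggers the "format '%d' expects type 'int'"
 * warning seen in the log.
 */
static int format_total(char *buf, size_t buflen, unsigned long total)
{
    return snprintf(buf, buflen, "total=%lu", total);
}
```

For values of type `size_t` the matching conversion in C99 is `%zu`; casting the argument to the type the format expects is the other common approach.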
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
On Wed, May 13, 2009 at 7:41 PM, Dejan Muhamedagic wrote:
> Hi,
>
> On Wed, May 13, 2009 at 05:36:40PM +0200, Nikola Ciprich wrote:
>> > holy !
>> yes! exactly! :)
>>
>> > sure
>> > in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf
>>
>> hmm, i tried that now, but all I got is:
>> May 13 16:46:16 faxb heartbeat: [1655]: ERROR: Heartbeat was not compiled
>> with --enable-libc-malloc, "crm valgrind" is therefor not supported.
>>
>> So I wanted to compile myself, but I see this option neither in
>> pacemaker's configure, nor in heartbeat's. But I noticed
>> --enable-valgrind option for heartbeat configure,
>> but enabling it and recompiling the heartbeat didn't help. so
>> maybe this part needs some updating?
>
> Looks like it. Just pushed a patch for that. Can you try again
> with the new tarball:
>
> http://hg.linux-ha.org/dev/archive/6467be4d4cb7.tar.bz2

Thanks Dejan!

Nikola, I also suggest the following two patches

diff -r 4038c4644964 configure.in
--- a/configure.in Wed May 13 17:07:22 2009 +0200
+++ b/configure.in Wed May 13 20:48:05 2009 +0200
@@ -2799,17 +2799,14 @@ AC_ARG_WITH(valgrind-suppress,
   [ VALGRIND_SUPP="/dev/null" ])
 if test "x" = "x$VALGRIND_LOG"; then
-  VALGRIND_LOG="--log-socket=127.0.0.1:1234"
-  AC_MSG_NOTICE(Set default Valgrind options to: $VALGRIND_OPTS)
-  AC_MSG_NOTICE(Remember to start a receiver on localhost:1234)
+  VALGRIND_LOG="--log-file=/tmp/crm-%p.valgrind"
 fi
-AC_PATH_PROG(VALGRIND_BIN, valgrind)
 if test "xyes" = "x$enable_valgrind" -a "x$VALGRIND_BIN" != "x"; then
   enable_libc_malloc=yes
 fi
-AC_DEFINE_UNQUOTED(VALGRIND_BIN, "$VALGRIND_BIN", Valgrind command)
+AC_DEFINE_UNQUOTED(VALGRIND_BIN, "valgrind", Valgrind command)
 AC_DEFINE_UNQUOTED(VALGRIND_LOG, "$VALGRIND_LOG", Valgrind logging options)
 AC_DEFINE_UNQUOTED(VALGRIND_SUPP, "$VALGRIND_SUPP", Name of a suppression file to pass to Valgrind)

diff -r 4038c4644964 crm/crmd/subsystems.c
--- a/crm/crmd/subsystems.c Wed May 13 17:07:22 2009 +0200
+++ b/crm/crmd/subsystems.c Wed May 13 20:48:05 2009 +0200
@@ -148,6 +148,7 @@ start_subsystem(struct crm_subsystem_s*
     unsigned int j;
     struct rlimit oflimits;
     const char *devnull = "/dev/null";
+    const char *grind = getenv("HA_VALGRIND_ENABLED");
     crm_info("Starting sub-system \"%s\"", the_subsystem->name);
     set_bit_inplace(fsa_input_register, the_subsystem->flag_required);
@@ -211,7 +212,8 @@ start_subsystem(struct crm_subsystem_s*
     (void)open(devnull, O_WRONLY); /* Stdout: fd 1 */
     (void)open(devnull, O_WRONLY); /* Stderr: fd 2 */
-    if(getenv("HA_VALGRIND_ENABLED") != NULL) {
+    if(grind != NULL
+       && (crm_is_true(grind) || strstr(grind, the_subsystem->name))) {
         char *opts[] = {
             crm_strdup(VALGRIND_BIN),
             crm_strdup("--show-reachable=yes"),
             crm_strdup("--leak-check=full"),
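The second hunk of the subsystems.c patch changes the test from "the variable is set" to "the variable is a true value, or it mentions this subsystem's name". A standalone sketch of that decision follows; `is_true` is a simplified, hypothetical stand-in for `crm_is_true()` (the real function may accept other spellings), and `should_valgrind` is a made-up name for the condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for crm_is_true(); the accepted spellings are an assumption. */
static int is_true(const char *s)
{
    return s != NULL
        && (strcmp(s, "yes") == 0 || strcmp(s, "true") == 0
            || strcmp(s, "on") == 0 || strcmp(s, "1") == 0);
}

/*
 * Mirrors the patched condition: run the subsystem under valgrind when the
 * setting is a true value, or when the setting contains the subsystem name.
 */
static int should_valgrind(const char *setting, const char *subsystem)
{
    if (setting == NULL) {
        return 0;
    }
    return is_true(setting) || strstr(setting, subsystem) != NULL;
}
```

Because `strstr` does a plain substring match, a setting such as "cib,crmd" selects both daemons, and any value that merely contains a subsystem's name will select it as well.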
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
Hi,

On Wed, May 13, 2009 at 05:36:40PM +0200, Nikola Ciprich wrote:
> > holy !
> yes! exactly! :)
>
> > sure
> > in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf
>
> hmm, i tried that now, but all I got is:
> May 13 16:46:16 faxb heartbeat: [1655]: ERROR: Heartbeat was not compiled
> with --enable-libc-malloc, "crm valgrind" is therefor not supported.
>
> So I wanted to compile myself, but I see this option neither in
> pacemaker's configure, nor in heartbeat's. But I noticed
> --enable-valgrind option for heartbeat configure,
> but enabling it and recompiling the heartbeat didn't help. so
> maybe this part needs some updating?

Looks like it. Just pushed a patch for that. Can you try again
with the new tarball:

http://hg.linux-ha.org/dev/archive/6467be4d4cb7.tar.bz2

Thanks,

Dejan
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
> holy !

yes! exactly! :)

> sure
> in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf

Hmm, I tried that now, but all I got is:

May 13 16:46:16 faxb heartbeat: [1655]: ERROR: Heartbeat was not compiled with --enable-libc-malloc, "crm valgrind" is therefor not supported.

So I wanted to compile it myself, but I don't see this option in either pacemaker's configure or heartbeat's. I did notice an --enable-valgrind option in heartbeat's configure, but enabling it and recompiling heartbeat didn't help. So maybe this part needs some updating?

BR
nik

> did this not work?
Re: [Pacemaker] cib still leaks in pacemaker-1.0.3
On May 13, 2009, at 8:28 AM, Nikola Ciprich wrote:
> Hello,
> I've reported this some time ago, few days ago I've updated my system to
> pacemaker-1.0.3 + related packages. But unfortunately cib process seems
> to be still leaking,ie it's RSS memory usage is constantly growing. This
> means we have to restart whole heartbeat service approximately once every
> two weeks as the memory usage of cib process gets to ~1.5GB.

holy !

> Some time ago when I was trying to use valgrind, I had some trouble,
> Andrew, You wrote that You're mostly testing openais variant, and it's
> possible that heartbeat has some problems being started with valgrind.
> could You please help me with running the cib process with valgrind so I
> could provide more accurate repport?

sure
in theory you can just add "crm valgrind" instead of "crm yes" in ha.cf

did this not work?
[Pacemaker] cib still leaks in pacemaker-1.0.3
Hello,
I reported this some time ago. A few days ago I updated my system to pacemaker-1.0.3 + related packages, but unfortunately the cib process still seems to be leaking, i.e. its RSS memory usage is constantly growing. This means we have to restart the whole heartbeat service approximately once every two weeks, as the memory usage of the cib process gets to ~1.5GB.

Some time ago, when I was trying to use valgrind, I had some trouble. Andrew, you wrote that you're mostly testing the openais variant, and that it's possible that heartbeat has some problems being started with valgrind. Could you please help me with running the cib process under valgrind, so I can provide a more accurate report?

Thank you very much in advance.
best regards
nik