Hi, maybe the attached patch (based on 3.0.4) can fix the leak. The daemon runs and reports metrics. It is of course too early to say.
When looking at the Linux metrics file, I just realized how much code duplication there is. Basically all function groups that grok the same /proc/xxx files should be rewritten to use common code. This is true for cpu, load and network. Maybe others.

Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

----- Original Message ----
> From: Martin Knoblauch <[EMAIL PROTECTED]>
> To: Kumar Vaibhav <[EMAIL PROTECTED]>; Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
> Cc: ganglia-developers@lists.sourceforge.net
> Sent: Thursday, February 14, 2008 11:36:37 AM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
> Hi,
>
> after looking at one of my employer's customers' installations, it definitely
> seems that metrics-collecting/non-mute "gmond"s are growing (substantially)
> over time. Pure listeners seem to be unaffected.
>
> If I remember correctly, Kumar's valgrind traces found that "strndup" might
> allocate memory that is later leaked. If I look at the 3.0.4
> libmetrics/linux/metrics.c, I have the strong feeling that all four network
> functions are careless about the memory allocated by strndup:
>
> 217: char *devname, *src;
> 228: devname = strndup(src, n);
> 238: net_dev_stats *ns = hash_lookup(devname, 1,
>
> 305: char *devname, *src;
> 316: devname = strndup(src, n);
> 326: net_dev_stats *ns = hash_lookup(devname, 1,
>
> 393: char *devname, *src;
> 404: devname = strndup(src, n);
> 414: net_dev_stats *ns = hash_lookup(devname, 1,
>
> 481: char *devname, *src;
> 492: devname = strndup(src, n);
> 502: net_dev_stats *ns = hash_lookup(devname, 1,
>
> Have to look at it some more.
>
> Cheers
> Martin
> ------------------------------------------------------
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www: http://www.knobisoft.de
>
> ----- Original Message ----
> > From: Kumar Vaibhav
> > To: Carlo Marcelo Arenas Belon
> > Cc: ganglia-developers@lists.sourceforge.net
> > Sent: Saturday, February 9, 2008 8:59:18 AM
> > Subject: Re: [Ganglia-developers] Memory leak in gmond
> >
> > Carlo Marcelo Arenas Belon wrote:
> > > On Tue, Jan 22, 2008 at 04:17:07PM +0530, Kumar Vaibhav wrote:
> > >> I am using ganglia-3.0.5 on a Woodcrest processor cluster, and I see
> > >> that after running for weeks the memory consumption of the gmond process
> > >> is about 400 MB.
> > >
> > > Did you check what the size was 1 hour after all gmond processes in your
> > > cluster were started? If you are using multicast and have a large number
> > > of nodes/metrics, then that is most likely the amount of memory needed to
> > > hold all those metrics from all nodes.
> > I checked it. The memory size increases with time. I tried ps -eo
> > cmd,rss and can see the size of gmond increase over time.
> >
> > >> ==2381== LEAK SUMMARY:
> > >> ==2381== definitely lost: 69 bytes in 16 blocks.
> > >> ==2381== possibly lost: 0 bytes in 0 blocks.
> > >
> > > that means there is no memory leak (except for 69 bytes)
> > This is because I had run it for only a few minutes.
> >
> > >> ==2381== still reachable: 1,446,276 bytes in 1,463 blocks.
> > >
> > > that is the RSS of your process
> > By memory I mean RSS only.
> >
> > Here are some new tests I have done.
> >
> > I isolated two nodes of the cluster by changing their multicast address.
> > On one I ran gmond in mute mode and on the other in deaf mode. The RSS of
> > gmond in deaf mode continues to increase, but the RSS of gmond in mute
> > mode stabilizes after some time and did not increase for a week.
> >
> > Hope this will help you to solve the problem.
> >
> > > Carlo
> >
> > Vaibhav
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Microsoft
> > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > _______________________________________________
> > Ganglia-developers mailing list
> > Ganglia-developers@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/ganglia-developers
linux-metrics.diff
Description: Binary data
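
The attachment is not reproduced in the archive, so the following is only a minimal sketch of the pattern discussed in the quoted message, not the actual patch. It assumes that the lookup keeps its own copy of the key, in which case the string returned by strndup can be freed right after the lookup. The names dev_stats, lookup_dev and update_from_line are hypothetical stand-ins; they are not the net_dev_stats and hash_lookup from libmetrics/linux/metrics.c, and whether freeing is safe in the real code depends on whether the real hash stores the pointer it is given.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for strndup under strict C modes */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-device counters kept by gmond. */
typedef struct {
    char name[32];
    unsigned long long rx_bytes;
} dev_stats;

/* Hypothetical stand-in for the hash: find an entry or create it.
 * Assumption: the table copies the key, so the caller still owns "name". */
#define MAX_DEVS 16
static dev_stats table[MAX_DEVS];
static int table_used;

static dev_stats *lookup_dev(const char *name, int create)
{
    for (int i = 0; i < table_used; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    if (!create || table_used == MAX_DEVS)
        return NULL;
    dev_stats *ns = &table[table_used++];
    snprintf(ns->name, sizeof ns->name, "%s", name);
    ns->rx_bytes = 0;
    return ns;
}

/* Parse one /proc/net/dev style line ("  eth0: 1234 ...") and store the
 * first counter. Returns the entry, or NULL if the line has no colon
 * or an allocation fails. */
static dev_stats *update_from_line(const char *line)
{
    const char *src = line;
    while (*src == ' ')
        src++;
    const char *colon = strchr(src, ':');
    if (colon == NULL)
        return NULL;

    char *devname = strndup(src, (size_t)(colon - src));
    if (devname == NULL)
        return NULL;

    dev_stats *ns = lookup_dev(devname, 1);
    free(devname);   /* without this, every parsed line leaks one small block */

    if (ns != NULL)
        ns->rx_bytes = strtoull(colon + 1, NULL, 10);
    return ns;
}
```

Putting the name extraction, the lookup and the free into one helper that all four network functions call would also address the duplication mentioned above, since the cleanup would then live in exactly one place.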
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers