Re: [Ganglia-general] Ganglia gmond memory leak?

2012-03-23 Thread Wiebalk, John
I ran a test on two new systems, one with the modules commented out and one 
with the modules running. The one without the modules grew from 3MB to 49MB of 
memory usage in 3 days, and the one with the modules grew from 3MB to 50MB in 
3 days. These were two freshly configured LPARs with nothing else running on 
them (AIX 6.1 TL5 SP6).

Is there a precompiled binary for valgrind on AIX? We try not to install 
compilers on our systems for security. However if not available I can install 
it on a lab system to run with gmond.


Thanks

--
John Wiebalk
Operating System Engineer
UNIX | Enterprise Technology Infrastructure
Phone: 412-647-3881
Email: wie...@upmc.edu

From: Wiebalk, John
Sent: Monday, March 12, 2012 1:47 PM
To: 'Ganglia-general@lists.sourceforge.net'
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?

We are also experiencing this issue at our site. We are running Ganglia 3.2 on 
AIX. We recently upgraded from 3.0.7 and started experiencing this issue. We 
used the rpm / ibm metrics from http://www.perzl.org/ganglia/

Has anyone tested whether this issue still exists in a newer version of Ganglia?


--
John Wiebalk
Operating System Engineer
UNIX | Enterprise Technology Infrastructure

--
This SF email is sponsored by:
Try Windows Azure free for 90 days Click Here 
http://p.sf.net/sfu/sfd2d-msazure
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


Re: [Ganglia-general] Ganglia gmond memory leak?

2012-03-23 Thread Michael Perzl



On 03/23/2012 12:34 PM, Wiebalk, John wrote:


I ran a test on two new systems, one with the modules commented out 
and one with the modules running. The one without the modules grew 
from 3MB to 49MB mem usage in 3 days and the one with the modules grew 
from 3MB to 50MB in 3 days. These were two freshly configured LPARs 
with nothing else running on them. AIX 6.1 TL5 SP6.


Hmm, so this cannot really be attributed to the additional gmond 
modules; it looks like an issue in gmond itself.


Is there a precompiled binary for valgrind on AIX? We try not to 
install compilers on our systems for security. However if not 
available I can install it on a lab system to run with gmond.


Unfortunately, there seems to be no precompiled binary of valgrind 
available for AIX. I am currently trying to get it compiled on AIX and will 
keep you updated on my progress over the next couple of days.


Regards,
Michael


Re: [Ganglia-general] Ganglia gmond memory leak?

2012-03-20 Thread David Birdsong
My gmonds always bloat quite large. To combat this, I've dedicated a
single host with a massive amount of swap to run gmonds in a
'collector' role. Every time I add a new cluster, I add a new gmond
instance on the collector box with a new port.

The recent discussion about setting dmax caused me to dig and find
that virtually all of my custom python library metrics are setting
dmax to zero which I think is the cause of this bloat.

I've failed so far at determining how to set dmax and I haven't had
much luck asking this list so far.

dmax? how do you set it?

On Tue, Mar 20, 2012 at 9:37 AM, Michael Perzl mich...@perzl.org wrote:
 Can you please try the following if possible:

 1) Run gmond without any additional modules and check if it is still
 leaking memory.
 -- This test would exclude - if gmond is then still leaking memory -
 any additional gmond module as the culprit for the memory leak.

 2) Do you have the chance to run gmond in the foreground for some time
 under some tool like valgrind or Purify?

 Regards,
 Michael

 On 03/20/2012 09:27 AM, Florian Munz wrote:
 yes, it's also happening for me on 3.3.1

 Anyone know how to move forward with this? I'd consider this a quite
 serious issue.


 Cheers,
 Florian

 On 12.03.12 18:46, Wiebalk, John wrote:
 We are also experiencing this issue at our site. We are running Ganglia
 3.2 on AIX. We recently upgraded from 3.0.7 and started experiencing this
 issue. We used the rpm / ibm metrics from http://www.perzl.org/ganglia/

 Has anyone tested whether this issue still exists in a newer version of
 Ganglia?

 --
 John Wiebalk

 Operating System Engineer

 UNIX | Enterprise Technology Infrastructure





Re: [Ganglia-general] Ganglia gmond memory leak?

2012-03-20 Thread Seth T Graham

On Mar 20, 2012, at 12:02 PM, David Birdsong wrote:

 My gmonds always bloat quite large. To combat this, I've dedicated a
 single host with a massive amount of swap to run gmonds in a
 'collector' role. Every time I add a new cluster, I add a new gmond
 instance on the collector box with a new port.
 
 The recent discussion about setting dmax caused me to dig and find
 that virtually all of my custom python library metrics are setting
 dmax to zero which I think is the cause of this bloat.
 
 I've failed so far at determining how to set dmax and I haven't had
 much luck asking this list so far.
 
 dmax? how do you set it?

Ganglia keeps three values for each metric submitted by a client: TMAX, TN, and 
DMAX.

TMAX, as far as users are concerned, is informational. It indicates the 
interval at which ganglia expects new values to be submitted by a host. 

TN indicates the number of seconds since a metric was last updated. When TN is 
bigger than TMAX, ganglia is waiting to store new data.

DMAX indicates how long old metrics should linger. If TN exceeds this number, 
ganglia will stop showing graphs for that metric.


So, set DMAX to an interval equal to when you no longer care about a metric 
being reported on the web page. Setting it to zero tells ganglia to never 
consider it expired.

Zero is appropriate for most stuff and in a default gmond install that's what 
you'll get. I would assume the only time one wants to set DMAX is in situations 
where the NAME attribute in a metric changes frequently (which I've seen; the 
torque PBS module does this). If you don't set DMAX in that case, your XML 
tree will quickly fill up with ancient data.
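
A minimal sketch of the rules described above, written in Python purely for illustration. It models only what this message states (TN compared against TMAX and DMAX, with a DMAX of zero meaning "never expire"). The function names are hypothetical and this is not gmond's actual code.

```python
def needs_update(tn: int, tmax: int) -> bool:
    """True when a new value is overdue, i.e. TN has passed TMAX."""
    return tn > tmax


def is_expired(tn: int, dmax: int) -> bool:
    """True when a metric has outlived DMAX and may be dropped.

    tn   -- seconds since the metric was last updated
    dmax -- maximum lifetime in seconds; 0 means "never expire"
    """
    return dmax > 0 and tn > dmax


# A metric last seen 10 minutes ago, with DMAX set to 5 minutes, is expired.
print(is_expired(tn=600, dmax=300))        # True
# With DMAX left at zero it is kept no matter how stale it gets.
print(is_expired(tn=10_000_000, dmax=0))   # False
```

Under this model, a metric whose name keeps changing and whose DMAX is zero is never cleaned up, which matches the accumulation described above.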





Re: [Ganglia-general] Ganglia gmond memory leak?

2012-03-01 Thread David Birdsong
On Thu, Feb 23, 2012 at 11:06 AM, Matt Massie m...@massie.us wrote:
 Each unique metric (keyed on metric name) requires memory space in gmond.

 A good test is to peek at the number of metrics in gmond over time, e.g.

 $ telnet localhost 8649 | grep METRIC | wc -l

 If the number of metrics over time increases, so will the memory use.

 Ganglia will release the metric and memory when the age of the metric is
 greater than DMAX.  A DMAX value of zero will cause ganglia to hold the
 metric indefinitely.  In order to make sure that ganglia is releasing old
 metrics, set the DMAX value to something like 5 minutes (300 secs).


aside from metrics originating from gmetric, where does one set dmax?

i can't find any reference on how to set it and my understanding of
gmond.conf is that host_dmax != dmax.

 For example, let's assume you are doing per-process monitoring and the metric
 name looks like

 "cpu_user.%d" % (pid,)

 Over time, you'll have lots of metrics (cpu_user.343493, cpu_user.343022,
 cpu_user.232323) that start accumulating and taking up memory space.

 -Matt



 On Thu, Feb 23, 2012 at 10:01 AM, svd.gang...@mylife.com wrote:

 i observed this in the past as well.  running valgrind for days did not
 yield any clue.  i had a hunch that remote spoofed metrics were involved,
 as the leak seemed to get better when i had coincidentally disabled the
 sending of some of those spoof metrics.  but, we never found anything
 conclusive.  there was also some odd race such that sometimes after
 restart the leak was much faster, but after restarting a few times the
 leak slowed (but was always still fast enough to be a burden).

 -scott

  From: Aidan Wong aidanw...@attinteractive.com
  To: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com;
  ganglia-general ganglia-general@lists.sourceforge.net
  Sent: Thursday, February 23, 2012 8:34 AM
  Subject: Re: [Ganglia-general] Ganglia gmond memory leak?
 
 
  I've restarted the gmond process and memory usage drops until gmond
  hogs memory over time.  Any Ganglia contributors who may want to chime in
  on this memory leak issue?  I'm on Ganglia 3.2.0.  Are there any improvements
  on version 3.3.1 addressing this issue?
 
 
  Thanks




Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-27 Thread Chris Burroughs
I've also observed this and have been unable to find a solution.  In my
case at least, there was no obvious correlation with the number of
metrics or whether the gmond was an aggregator or not (several
orders of magnitude in the number of metrics did not matter; it might
happen on 2 out of 80 nodes). gmond would take up more memory than physical
RAM, leading to swap and general sadness.

I'm unfortunately not able to provide further information, since we went
to nightly gmond restarts as a workaround.

On 02/22/2012 05:10 PM, Aidan Wong wrote:
 Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The 
 amount of resident used memory that the process uses, gets up pretty high and 
 keeps increasing.
 
 USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
 root     18647  0.0  9.9 2965464 1836268 ?     Ss   Jan14  11:24 
 /home/t/hadoop-ganglia-client/sbin/gmond -c 
 /home/t/hadoop-ganglia-client/gmond.conf -p 
 /home/t/hadoop-ganglia-client/logs/gmond.pid
 
 Is this a bug?  Can anyone suggest a solution?
 
 Thank you
 
 
 
 


Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-27 Thread Martin Knoblauch
Hi Aidan,

 for what it is worth, I cannot reproduce the growing memory consumption on a 
small 3.2.0 grid using only standard metrics in unicast mode. Running now for a 
few hours. Will check again tomorrow.

Cheers

Martin 

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de



 From: Aidan Wong aidanw...@attinteractive.com
To: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com; 
ganglia-general ganglia-general@lists.sourceforge.net 
Sent: Thursday, February 23, 2012 8:34 AM
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?
 

I've restarted the gmond process and memory usage drops until gmond hogs 
memory over time.  Any Ganglia contributors who may want to chime in on this 
memory leak issue?  I'm on Ganglia 3.2.0.  Are there any improvements on 
version 3.3.1 addressing this issue?


Thanks

From: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com
Date: Wed, 22 Feb 2012 16:31:58 -0600
To: Aidan Wong aidanw...@attinteractive.com, ganglia-general 
ganglia-general@lists.sourceforge.net
Subject: RE: Ganglia gmond memory leak?



 
I have seen the same behavior in my environment but do not have a solution.
 
 
Nathan


 
From:Aidan Wong [mailto:aidanw...@attinteractive.com] 
Sent: Wednesday, February 22, 2012 4:10 PM
To: ganglia-general
Subject: [Ganglia-general] Ganglia gmond memory leak?
 
Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The 
amount of resident used memory that the process uses, gets up pretty high and 
keeps increasing.
 
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     18647  0.0  9.9 2965464 1836268 ?     Ss   Jan14  11:24 
/home/t/hadoop-ganglia-client/sbin/gmond -c 
/home/t/hadoop-ganglia-client/gmond.conf -p 
/home/t/hadoop-ganglia-client/logs/gmond.pid
 
Is this a bug?  Can anyone suggest a solution?
 
Thank you

 CONFIDENTIALITY NOTICE: This e-mail and any files transmitted with it are 
 intended solely for the use of the individual or entity to whom they are 
 addressed and may contain confidential and privileged information protected 
 by law. If you received this e-mail in error, any review, use, dissemination, 
 distribution, or copying of the e-mail is strictly prohibited. Please notify 
 the sender immediately by return e-mail and delete all copies from your 
 system.


 


Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-24 Thread Aidan Wong
I'm not using any IIRC plugins as far as I know.  I'm using basically
Ganglia 3.2.0 right out of the box.  The extra metrics that I'm sending
are from my Hadoop cluster nodes where I defined the host and gmond port
of the destination gmond that collects the metrics.

On 2/23/12 6:27 PM, Robin Humble robin.humble+gang...@anu.edu.au wrote:

On Thu, Feb 23, 2012 at 07:22:36PM +, Aidan Wong wrote:
That one node that recently had the running away memory leak was sending
253 metrics.  I'm using unicast sending all metrics to a specific host
where I have configured the udp_send_channel with the host and port
attributes defined.

IIRC plugins are loaded once and then run within gmond's address space.
so I guess plugins could be causing memory leaks.
which plugins are you using? they alloc/free as they should?

we haven't noticed any leaks (certainly no serious leaks) across our
~1800 gmonds using 3.2.0, but we aren't sending that many metrics
either - just using most of the standard stuff plus modified diskstat,
cputemp python plugins, and with a bunch of other metrics spoof'd from
chassis and switches (more cpu cycles for HPC job this way).
we are using multicast.
all except a few gmonds (not included below) are senders only.

        %CPU %MEM    VSZ   RSS  COMMAND
min      0.0  0.0  70972  1864  /usr/sbin/gmond
median   0.0  0.0  70972  3392  /usr/sbin/gmond
ave      0.0  0.0  70977  3240  /usr/sbin/gmond
max      0.0  0.0  71104  4720  /usr/sbin/gmond

those with the larger RSS have been rebooted recently and haven't yet
had unused pages pushed out by vm pressure.

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility



Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-24 Thread Aidan Wong
I have the following config in regards to metric cleanup:

  host_dmax =  259200 /*secs - 3 days*/
  cleanup_threshold = 300 /*secs */


From: Matt Massie m...@massie.us
Date: Thu, 23 Feb 2012 11:06:03 -0800
To: svd.gang...@mylife.com
Cc: ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?

Each unique metric (keyed on metric name) requires memory space in gmond.

A good test is to peek at the number of metrics in gmond over time, e.g.

$ telnet localhost 8649 | grep METRIC | wc -l

If the number of metrics over time increases, so will the memory use.
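
A rough Python equivalent of the telnet check, offered as a sketch rather than a supported tool. It assumes, as the telnet command implies, that gmond answers a TCP connection on port 8649 with its XML dump and then closes the connection. The function names are hypothetical. It counts METRIC elements rather than matching lines, so the absolute number will differ from the grep count; only the trend over time matters.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host: str = "localhost", port: int = 8649,
                    timeout: float = 5.0) -> bytes:
    """Read everything gmond sends on its TCP channel until it closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def count_metrics(xml_bytes: bytes) -> int:
    """Count METRIC elements at any depth of the XML dump."""
    root = ET.fromstring(xml_bytes)
    return sum(1 for _ in root.iter("METRIC"))
```

Calling count_metrics(fetch_gmond_xml()) from cron every few minutes and logging the result next to the RSS of gmond would show whether memory growth follows the metric count.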

Ganglia will release the metric and memory when the age of the metric is 
greater than DMAX.  A DMAX value of zero will cause ganglia to hold the metric 
indefinitely.  In order to make sure that ganglia is releasing old metrics, set 
the DMAX value to something like 5 minutes (300 secs).

For example, let's assume you are doing per-process monitoring and the metric 
name looks like

"cpu_user.%d" % (pid,)

Over time, you'll have lots of metrics (cpu_user.343493, cpu_user.343022, 
cpu_user.232323) that start accumulating and taking up memory space.
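
To make the accumulation concrete, here is a small Python sketch. The per-PID naming follows the example above; the per-role naming is a hypothetical alternative added only for comparison, and the helper names are invented for this illustration.

```python
def per_pid_name(pid):
    """Metric name that embeds the process ID, as in the example above."""
    return "cpu_user.%d" % (pid,)


def per_role_name(role):
    """Hypothetical alternative: name the metric after a stable role."""
    return "cpu_user.%s" % (role,)


def distinct_names(names):
    """Number of unique metric names, i.e. entries that have to be held."""
    return len(set(names))


# Three restarts of the same worker process, each with a new PID.
pids = [343493, 343022, 232323]
print(distinct_names(per_pid_name(p) for p in pids))          # 3
print(distinct_names(per_role_name("worker") for _ in pids))  # 1
```

With per-PID names every restart adds a new entry, so without a non-zero DMAX the old entries are never released.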

-Matt


On Thu, Feb 23, 2012 at 10:01 AM, svd.gang...@mylife.com wrote:
i observed this in the past as well.  running valgrind for days did not
yield any clue.  i had a hunch that remote spoofed metrics were involved,
as the leak seemed to get better when i had coincidentally disabled the
sending of some of those spoof metrics.  but, we never found anything
conclusive.  there was also some odd race such that sometimes after
restart the leak was much faster, but after restarting a few times the
leak slowed (but was always still fast enough to be a burden).

-scott

 From: Aidan Wong aidanw...@attinteractive.com
 To: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com;
 ganglia-general ganglia-general@lists.sourceforge.net
 Sent: Thursday, February 23, 2012 8:34 AM
 Subject: Re: [Ganglia-general] Ganglia gmond memory leak?


 I've restarted the gmond process and memory usage drops until gmond hogs 
 memory over time.  Any Ganglia contributors who may want to chime in on this 
 memory leak issue?  I'm on Ganglia 3.2.0.  Are there any improvements on 
 version 3.3.1 addressing this issue?


 Thanks



Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread Martin Knoblauch
Hi Aidan,

 if possible for you, I would suggest running the gmond in foreground under 
the control of valgrind or a similar tool. Send us the report generated by 
the tool.

Cheers

Martin 

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de



 From: Aidan Wong aidanw...@attinteractive.com
To: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com; 
ganglia-general ganglia-general@lists.sourceforge.net 
Sent: Thursday, February 23, 2012 8:34 AM
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?
 

I've restarted the gmond process and memory usage drops until gmond hogs 
memory over time.  Any Ganglia contributors who may want to chime in on this 
memory leak issue?  I'm on Ganglia 3.2.0.  Are there any improvements on 
version 3.3.1 addressing this issue?


Thanks

From: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com
Date: Wed, 22 Feb 2012 16:31:58 -0600
To: Aidan Wong aidanw...@attinteractive.com, ganglia-general 
ganglia-general@lists.sourceforge.net
Subject: RE: Ganglia gmond memory leak?



 
I have seen the same behavior in my environment but do not have a solution.
 
 
Nathan


 
From:Aidan Wong [mailto:aidanw...@attinteractive.com] 
Sent: Wednesday, February 22, 2012 4:10 PM
To: ganglia-general
Subject: [Ganglia-general] Ganglia gmond memory leak?
 
Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The 
amount of resident used memory that the process uses, gets up pretty high and 
keeps increasing.
 
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     18647  0.0  9.9 2965464 1836268 ?     Ss   Jan14  11:24 
/home/t/hadoop-ganglia-client/sbin/gmond -c 
/home/t/hadoop-ganglia-client/gmond.conf -p 
/home/t/hadoop-ganglia-client/logs/gmond.pid
 
Is this a bug?  Can anyone suggest a solution?
 
Thank you



Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread Jesse Becker
How many metrics are you monitoring?  gmond must allocate memory for
each metric, from each host.  If you are using multicast, each gmond
instance will get metrics from all other instances.

If you run gmond in isolation--no traffic to/from other gmond
instances--does memory usage still go up?

On Wed, Feb 22, 2012 at 17:10, Aidan Wong aidanw...@attinteractive.com wrote:
 Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The
 amount of resident used memory that the process uses, gets up pretty high
 and keeps increasing.

 USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
 root     18647  0.0  9.9 2965464 1836268 ?     Ss   Jan14  11:24
 /home/t/hadoop-ganglia-client/sbin/gmond -c
 /home/t/hadoop-ganglia-client/gmond.conf -p
 /home/t/hadoop-ganglia-client/logs/gmond.pid

 Is this a bug?  Can anyone suggest a solution?

 Thank you





-- 
Jesse Becker



Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread Martin Knoblauch
Hi Jesse,

 but in that case the memory footprint of gmond would approach a maximum 
after some time - correct? Aidan did not say whether it grows forever or goes 
asymptotic. Aidan?
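
One rough way to answer that question from a series of RSS samples is sketched below in Python. It is a heuristic only: the function name, the 0.25 ratio, and the sample values are assumptions made up for illustration, not measurements from this thread.

```python
def is_levelling_off(rss_kb, ratio=0.25):
    """Heuristic: True when recent growth is small next to early growth.

    rss_kb -- RSS samples in KB taken at equal intervals, oldest first
    ratio  -- late growth must be at most this fraction of early growth
    """
    if len(rss_kb) < 4:
        raise ValueError("need at least 4 samples")
    mid = len(rss_kb) // 2
    early = rss_kb[mid] - rss_kb[0]   # growth over the first half
    late = rss_kb[-1] - rss_kb[mid]   # growth over the second half
    if early <= 0:
        return late <= 0
    return late <= ratio * early


# Illustrative data: growth that flattens out versus steady linear growth.
print(is_levelling_off([3000, 20000, 30000, 33000, 34000, 34200]))  # True
print(is_levelling_off([3000, 18000, 33000, 48000, 63000, 78000]))  # False
```

A footprint that levels off would fit the "memory per metric per host" explanation; steady growth with a constant metric count would point to a real leak.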

 
Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de



 From: Jesse Becker haw...@gmail.com
To: Aidan Wong aidanw...@attinteractive.com 
Cc: ganglia-general ganglia-general@lists.sourceforge.net 
Sent: Thursday, February 23, 2012 2:36 PM
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?
 
How many metrics are you monitoring?  gmond must allocate memory for
each metric, from each host.  If you are using multicast, each gmond
instance will get metrics from all other instances.

If you run gmond in isolation--no traffic to/from other gmond
instances--does memory usage still go up?

On Wed, Feb 22, 2012 at 17:10, Aidan Wong aidanw...@attinteractive.com wrote:
 Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The
 amount of resident used memory that the process uses, gets up pretty high
 and keeps increasing.

 USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
 root     18647  0.0  9.9 2965464 1836268 ?     Ss   Jan14  11:24
 /home/t/hadoop-ganglia-client/sbin/gmond -c
 /home/t/hadoop-ganglia-client/gmond.conf -p
 /home/t/hadoop-ganglia-client/logs/gmond.pid

 Is this a bug?  Can anyone suggest a solution?

 Thank you





-- 
Jesse Becker



Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread svd.ganglia
i observed this in the past as well.  running valgrind for days did not 
yield any clue.  i had a hunch that remote spoofed metrics were involved, 
as the leak seemed to get better when i had coincidentally disabled the 
sending of some of those spoof metrics.  but, we never found anything 
conclusive.  there was also some odd race such that sometimes after 
restart the leak was much faster, but after restarting a few times the 
leak slowed (but was always still fast enough to be a burden).

-scott

 From: Aidan Wong aidanw...@attinteractive.com
 To: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com; 
 ganglia-general ganglia-general@lists.sourceforge.net
 Sent: Thursday, February 23, 2012 8:34 AM
 Subject: Re: [Ganglia-general] Ganglia gmond memory leak?


 I've restarted the gmond process and memory usage drops until gmond hogs 
 memory over time.  Any Ganglia contributors who may want to chime in on this 
 memory leak issue?  I'm on Ganglia 3.2.0.  Are there any improvements on 
 version 3.3.1 addressing this issue?


 Thanks



Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread svd.ganglia
makes sense, but i know in my case the number of metrics was constant 
once the server gmond had been running for about 10 minutes and all gmetric 
crons had had a chance to submit an initial value.


-scott

On Thu, 23 Feb 2012, Matt Massie wrote:


Each unique metric (keyed on metric name) requires memory space in gmond.  
A good test is to peek at the number of metrics in gmond over time, e.g.
$ telnet localhost 8649 | grep METRIC | wc -l

If the number of metrics over time increases, so will the memory use.
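
The same check can be scripted so the count can be logged over time. What follows is a minimal sketch in Python, assuming gmond answers on its XML port (8649 here, as in the telnet example) and that each metric shows up as one opening METRIC tag in the dump. The function names count_metrics and fetch_gmond_xml are made up for this example and are not part of Ganglia.

```python
import re
import socket


def count_metrics(xml_text):
    """Count opening <METRIC ...> tags in a gmond XML dump."""
    # Match only opening tags, so closing </METRIC> tags are not counted twice.
    return len(re.findall(r"<METRIC\s", xml_text))


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read everything gmond writes on its XML port until it closes the socket.

    Host and port are assumptions taken from the telnet example above.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    # latin-1 never fails to decode, which is enough for counting tags.
    return b"".join(chunks).decode("latin-1")
```

Logging count_metrics(fetch_gmond_xml()) every few minutes next to the RSS of gmond should show whether memory growth follows the metric count or not.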

Ganglia will release the metric and memory when the age of the metric is 
greater than DMAX.  A DMAX value of zero will cause ganglia to hold the metric
indefinitely.  In order to make sure that ganglia is releasing old metrics, set 
the DMAX value to something like 5 minutes (300 secs).

For example, let's assume you are doing per-process monitoring and the metric 
name looks like

"cpu_user.%d" % (pid,)

Over time, you'll have lots of metrics (cpu_user.343493, cpu_user.343022, 
cpu_user.232323) that start accumulating and taking up memory space.
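
The effect of DMAX on such per-pid names can be illustrated with a toy model. This is only a sketch of the expiry rule described above (a metric is dropped once its age exceeds DMAX, and DMAX of zero means keep forever); it is not gmond's actual code, and MetricTable and simulate are hypothetical names.

```python
class MetricTable:
    """Toy metric store with DMAX-style expiry (illustration only)."""

    def __init__(self):
        self._metrics = {}  # name -> (last_seen_seconds, dmax_seconds)

    def update(self, name, now, dmax):
        self._metrics[name] = (now, dmax)

    def expire(self, now):
        """Drop metrics older than their dmax; dmax == 0 is never dropped."""
        stale = [name for name, (seen, dmax) in self._metrics.items()
                 if dmax > 0 and now - seen > dmax]
        for name in stale:
            del self._metrics[name]
        return len(stale)

    def __len__(self):
        return len(self._metrics)


def simulate(dmax, steps=100, interval=60):
    """Report one brand-new per-pid metric name every `interval` seconds."""
    table = MetricTable()
    for step in range(steps):
        now = step * interval
        table.update("cpu_user.%d" % (1000 + step,), now, dmax)
        table.expire(now)
    return len(table)
```

With dmax=0 the table ends up holding all 100 names, while with dmax=300 only the six most recently reported names remain.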

-Matt


Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread Aidan Wong
The one node that recently had the runaway memory leak was sending
253 metrics.  I'm using unicast, sending all metrics to a specific host,
and I have configured the udp_send_channel with the host and port
attributes defined.

On 2/23/12 5:36 AM, Jesse Becker haw...@gmail.com wrote:

How many metrics are you monitoring?  gmond must allocate memory for
each metric, from each host.  If you are using multicast, each gmond
instance will get metrics from all other instances.

If you run gmond in isolation--no traffic to/from other gmond
instances--does memory usage still go up?

On Wed, Feb 22, 2012 at 17:10, Aidan Wong aidanw...@attinteractive.com wrote:
 Hi it looks like my install of gmond version 3.2.0 is leaking memory.  The
 amount of resident used memory that the process uses, gets up pretty high
 and keeps increasing.

 USER   PID %CPU %MEM     VSZ     RSS TTY STAT START  TIME COMMAND
 root 18647  0.0  9.9 2965464 1836268 ?   Ss   Jan14 11:24 /home/t/hadoop-ganglia-client/sbin/gmond
      -c /home/t/hadoop-ganglia-client/gmond.conf
      -p /home/t/hadoop-ganglia-client/logs/gmond.pid

 Is this a bug?  Can anyone suggest a solution?

 Thank you

 




-- 
Jesse Becker






Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread Aidan Wong
To me it looks like gmond memory usage keeps growing as long as there is free 
memory left, and I've seen some nodes where gmond caused swapping.

Before restart of gmond:
$ free -m
             total   used   free shared buffers cached
Mem:         18038  13217   4820      0     163   8719
-/+ buffers/cache:   4335  13703
Swap:         5945   4795   1150
$ ps aux | grep gmond
15950 16419  0.0  0.0   61180    760 pts/0 S+  19:25  0:00 grep gmond
root  16804  0.0  5.3 9195200 979944 ?     Ss   2011 36:06 /home/t/hadoop-ganglia-client/sbin/gmond
      -c /home/t/hadoop-ganglia-client/gmond.conf
      -p /home/t/hadoop-ganglia-client/logs/gmond.pid

After restart of gmond:
$ free -m
             total   used   free shared buffers cached
Mem:         18038  13715   4322      0     165   8842
-/+ buffers/cache:   4708  13330
Swap:         5945    151   5794
$ ps aux | grep gmond
root  18492  0.0  0.0   43228   1348 ?     Ss  19:26  0:00 /home/t/hadoop-ganglia-client/sbin/gmond
      -c /home/t/hadoop-ganglia-client/gmond.conf
      -p /home/t/hadoop-ganglia-client/logs/gmond.pid
15950 18717  0.0  0.0   61184    772 pts/0 S+  19:27  0:00 grep gmond


From: Martin Knoblauch kn...@knobisoft.de
Reply-To: Martin Knoblauch kn...@knobisoft.de
Date: Thu, 23 Feb 2012 05:56:26 -0800
To: Jesse Becker haw...@gmail.com, Aidan Wong aidanw...@attinteractive.com
Cc: ganglia-general ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?

Hi Jesse,

 but in that case the memory footprint of gmond would approach a maximum 
after some time - correct? Aidan did not say whether it grows forever or goes 
asymptotic. Aidan?
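
One rough way to tell the two cases apart is to sample the resident size at a fixed interval and compare how much it grew in the second half of the series with how much it grew in the first half. The sketch below assumes a Linux host (it reads VmRSS from /proc), and read_rss_kb and growth_ratio are hypothetical helper names. A ratio near 1 suggests steady, unbounded growth; a ratio near 0 suggests the footprint is leveling off.

```python
def read_rss_kb(pid):
    """Return the resident set size of `pid` in kB (Linux /proc only)."""
    with open("/proc/%d/status" % pid) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found for pid %d" % pid)


def growth_ratio(rss_samples):
    """Growth over the second half of the samples divided by growth over the first half.

    The samples are assumed to be taken at equal time intervals.
    """
    if len(rss_samples) < 3:
        raise ValueError("need at least three samples")
    mid = len(rss_samples) // 2
    early = rss_samples[mid] - rss_samples[0]
    late = rss_samples[-1] - rss_samples[mid]
    if early <= 0:
        raise ValueError("no growth in the first half; nothing to compare")
    return late / early
```

For example, samples of 3, 13, 23, 33, 43 MB give a ratio of 1.0 (steady growth), while 3, 40, 48, 50, 50 MB give a ratio well below 0.1 (leveling off).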

Cheers
Martin
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

From: Jesse Becker haw...@gmail.com
To: Aidan Wong aidanw...@attinteractive.com
Cc: ganglia-general ganglia-general@lists.sourceforge.net
Sent: Thursday, February 23, 2012 2:36 PM
Subject: Re: [Ganglia-general] Ganglia gmond memory leak?

How many metrics are you monitoring?  gmond must allocate memory for
each metric, from each host.  If you are using multicast, each gmond
instance will get metrics from all other instances.

If you run gmond in isolation--no traffic to/from other gmond
instances--does memory usage still go up?

On Wed, Feb 22, 2012 at 17:10, Aidan Wong aidanw...@attinteractive.com wrote:
 Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The
 amount of resident used memory that the process uses, gets up pretty high
 and keeps increasing.

 USER   PID %CPU %MEM     VSZ     RSS TTY STAT START  TIME COMMAND
 root 18647  0.0  9.9 2965464 1836268 ?   Ss   Jan14 11:24 /home/t/hadoop-ganglia-client/sbin/gmond
      -c /home/t/hadoop-ganglia-client/gmond.conf
      -p /home/t/hadoop-ganglia-client/logs/gmond.pid

 Is this a bug?  Can anyone suggest a solution?

 Thank you





--
Jesse Becker





Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-23 Thread Robin Humble
On Thu, Feb 23, 2012 at 07:22:36PM +, Aidan Wong wrote:
The one node that recently had the runaway memory leak was sending
253 metrics.  I'm using unicast, sending all metrics to a specific host,
and I have configured the udp_send_channel with the host and port
attributes defined.

IIRC plugins are loaded once and then run within gmond's address space.
so I guess plugins could be causing memory leaks.
which plugins are you using? they alloc/free as they should?

we haven't noticed any leaks (certainly no serious leaks) across our
~1800 gmonds using 3.2.0, but we aren't sending that many metrics
either - just using most of the standard stuff plus modified diskstat,
cputemp python plugins, and with a bunch of other metrics spoof'd from
chassis and switches (more cpu cycles for HPC job this way).
we are using multicast.
all except a few gmonds (not included below) are senders only.

        %CPU %MEM    VSZ   RSS  COMMAND
min      0.0  0.0  70972  1864  /usr/sbin/gmond
median   0.0  0.0  70972  3392  /usr/sbin/gmond
ave      0.0  0.0  70977  3240  /usr/sbin/gmond
max      0.0  0.0  71104  4720  /usr/sbin/gmond

those with the larger RSS have been rebooted recently and haven't yet
had unused pages pushed out by vm pressure.

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility



[Ganglia-general] Ganglia gmond memory leak?

2012-02-22 Thread Aidan Wong
Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The 
amount of resident used memory that the process uses, gets up pretty high and 
keeps increasing.

USER   PID %CPU %MEM     VSZ     RSS TTY STAT START  TIME COMMAND
root 18647  0.0  9.9 2965464 1836268 ?   Ss   Jan14 11:24 /home/t/hadoop-ganglia-client/sbin/gmond
     -c /home/t/hadoop-ganglia-client/gmond.conf
     -p /home/t/hadoop-ganglia-client/logs/gmond.pid

Is this a bug?  Can anyone suggest a solution?

Thank you


Re: [Ganglia-general] Ganglia gmond memory leak?

2012-02-22 Thread Aidan Wong
I've restarted the gmond process and memory usage drops, but gmond hogs memory 
again over time.  Any Ganglia contributors who may want to chime in on this 
memory leak issue?  I'm on Ganglia 3.2.0.  Are there any improvements in 
version 3.3.1 addressing this issue?

Thanks

From: Ave-Lallemant, Nathan P nathan.p.ave-lallem...@efleets.com
Date: Wed, 22 Feb 2012 16:31:58 -0600
To: Aidan Wong aidanw...@attinteractive.com, 
ganglia-general ganglia-general@lists.sourceforge.net
Subject: RE: Ganglia gmond memory leak?

I have seen the same behavior in my environment but do not have a solution.


Nathan

From: Aidan Wong [mailto:aidanw...@attinteractive.com]
Sent: Wednesday, February 22, 2012 4:10 PM
To: ganglia-general
Subject: [Ganglia-general] Ganglia gmond memory leak?

Hi it looks like my install of gmond version 3.2.0 is leaking memory.   The 
amount of resident used memory that the process uses, gets up pretty high and 
keeps increasing.

USER   PID %CPU %MEM     VSZ     RSS TTY STAT START  TIME COMMAND
root 18647  0.0  9.9 2965464 1836268 ?   Ss   Jan14 11:24 /home/t/hadoop-ganglia-client/sbin/gmond
     -c /home/t/hadoop-ganglia-client/gmond.conf
     -p /home/t/hadoop-ganglia-client/logs/gmond.pid

Is this a bug?  Can anyone suggest a solution?

Thank you



