Re: [Ganglia-general] Java/JMX plugin for Ganglia 3.1.x

2012-09-14 Thread Chris Burroughs
We have all of our metrics of interest generated by Coda Hale's metrics
package: http://metrics.codahale.com/

That includes a ganglia reporter that can be used to send metrics to
gmond.  (But not arbitrary pre-existing beans.)

On 09/13/2012 08:43 AM, Martin Knoblauch wrote:
 Hi,
 
  as part of a larger tomcat deployment I need to monitor several tomcat 
 instances and want to add the measured data to a Ganglia setup. I already 
 found JMXtrans which seems a cool solution, but it uses host spoofing and I 
 am not sure it is what I really want. Needs some real investigating.
 
 
  What I would love to have is a Gmond plugin that can simply add 
 the measured metrics to the system metrics. Has anybody already written such a 
 plugin, or is anybody working on one? I could provide testing, feedback and maybe help.
 
 Cheers
 
 Martin 
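
For reference, gmond 3.1.x can load metric modules written in Python, which
is one way to add a metric alongside the system metrics as asked above. The
following is only a minimal sketch of that module shape. The metric name
tomcat_threads_busy and the helper read_tomcat_threads() are hypothetical
placeholders; a real module would have to obtain the value from the JVM
itself (for example over JMX), which is not shown here.

```python
# Sketch of a gmond Python metric module.
# ASSUMPTION: read_tomcat_threads() is a hypothetical stand-in that returns a
# fixed number; it does not talk to Tomcat or JMX.

def read_tomcat_threads():
    """Hypothetical placeholder that returns a fixed busy-thread count."""
    return 42


def tomcat_threads_handler(name):
    """Callback gmond invokes to collect the metric called `name`."""
    return read_tomcat_threads()


def metric_init(params):
    """Called once by gmond; returns the list of metric descriptors."""
    descriptor = {
        "name": "tomcat_threads_busy",  # hypothetical metric name
        "call_back": tomcat_threads_handler,
        "time_max": 90,
        "value_type": "uint",
        "units": "threads",
        "slope": "both",
        "format": "%u",
        "description": "Busy Tomcat worker threads (placeholder value)",
        "groups": "tomcat",
    }
    return [descriptor]


def metric_cleanup():
    """Called once when gmond shuts the module down."""
    pass
```

The module can be exercised outside gmond by calling metric_init({}) and
then invoking the call_back of each returned descriptor by hand.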


--
Got visibility?
Most devs has no idea what their production app looks like.
Find out how fast your code is with AppDynamics Lite.
http://ad.doubleclick.net/clk;262219671;13503038;y?
http://info.appdynamics.com/FreeJavaPerformanceDownload.html
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


Re: [Ganglia-general] Java/JMX plugin for Ganglia 3.1.x

2012-09-14 Thread Peter Phaal
Martin,

If you can upgrade to the latest Ganglia release, you could use sFlow
to monitor your Tomcat servers. The jmx-sflow-agent exports standard
JVM metrics, and the tomcat-sflow-valve can export the JVM metrics as
well as HTTP counters and transactions.

http://host-sflow.sourceforge.net/relatedlinks.php

Cheers,
Peter

On Thu, Sep 13, 2012 at 5:43 AM, Martin Knoblauch kn...@knobisoft.de wrote:
 Hi,

  as part of a larger tomcat deployment I need to monitor several tomcat
 instances and want to add the measured data to a Ganglia setup. I already
 found JMXtrans which seems a cool solution, but it uses host spoofing and
 I am not sure it is what I really want. Needs some real investigating.

  What I would love to have is a Gmond plugin that can simply add
 the measured metrics to the system metrics. Has anybody already written such a
 plugin, or is anybody working on one? I could provide testing, feedback and maybe
 help.

 Cheers
 Martin
 --
 Martin Knoblauch
 email: k n o b i AT knobisoft DOT de
 www: http://www.knobisoft.de





[Ganglia-general] Impact of gmond polling on data collection

2012-09-14 Thread Chris Burroughs
We use ganglia to monitor  500 hosts in multiple datacenters with about
90k unique host:metric pairs per DC.  We use this data for all of the
cool graphs in the web UI and for passive alerting.

One of our checks is to measure TN of load_one on every box (we want to
make sure gmond is working and correctly updating metrics; otherwise we
could be blind and not know it).  We consider it a failure if TN is
greater than 600 seconds.  This is an arbitrary number, but 10 minutes
seemed plenty long.
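
The check described above can be sketched roughly as follows. This is not
the poster's actual check, only an illustration under stated assumptions:
it reads the XML dump that gmond writes on its TCP port (8649 by default)
and reports every host whose load_one metric has a TN above 600. The
function names fetch_gmond_xml and stale_hosts are made up for this example.

```python
import socket
import xml.etree.ElementTree as ET

TN_LIMIT = 600  # seconds; the threshold named in the message


def fetch_gmond_xml(host, port=8649, timeout=10.0):
    """Read the full XML dump gmond writes on its TCP accept channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def stale_hosts(xml_bytes, metric="load_one", limit=TN_LIMIT):
    """Return {host_name: tn} for hosts whose `metric` has TN above `limit`."""
    root = ET.fromstring(xml_bytes)
    stale = {}
    for host in root.iter("HOST"):
        for m in host.iter("METRIC"):
            if m.get("NAME") == metric:
                tn = int(m.get("TN", "0"))
                if tn > limit:
                    stale[host.get("NAME")] = tn
    return stale
```

Typical use would be stale_hosts(fetch_gmond_xml("aggregator.example.com")),
where the host name is a placeholder for one of the gmond aggregators.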

Unfortunately we are seeing this check fail far too often.  We set up
two parallel gmetad instances (monitoring identical gmonds) per DC and
have broken our problem into two classes:
 * (A) Only one of the gmetads stops updating for an entire cluster and
must be restarted to recover.  Since the gmetads disagree, we know the
problem is in gmetad. [1]
 * (B) Both gmetads say an individual host has not reported (gmond
aggregation or sending must be at fault).  This issue is usually
transient (that is, it recovers after some period of time greater than 10
minutes).

While attempting to reproduce (A) we ran several additional gmetad
instances (again polling the same gmonds) around 2012-09-07.  Failures
per day are below [2].  The act of testing seems to have significantly
increased the number of failures.

This led us to consider whether the act of polling a gmond aggregator
could impact its ability to concurrently collect metrics.  We looked at
the code but are not experienced with concurrent programming in C.
Could someone with more familiarity with the gmond code comment on
whether this is likely to be a worthwhile avenue of investigation?  We are
also looking for suggestions for an empirical test to rule this out.
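
One possible shape for such an empirical test, offered only as a sketch:
poll a test aggregator from a varying number of concurrent clients and
compare how often metrics go stale at each level. The helper below is
hypothetical; `fetch` is any zero-argument callable that returns the bytes
read from a gmond's TCP port, and the worker and round counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def poll_concurrently(fetch, workers=4, rounds=3, pause=0.0):
    """Call fetch() from `workers` threads, `rounds` times each.

    Returns a flat list of (elapsed_seconds, payload_length) tuples,
    one tuple per fetch() call.
    """
    def worker(_index):
        samples = []
        for _round in range(rounds):
            start = time.monotonic()
            payload = fetch()
            samples.append((time.monotonic() - start, len(payload)))
            time.sleep(pause)  # optional gap between polls
        return samples

    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_worker = list(pool.map(worker, range(workers)))
    return [sample for samples in per_worker for sample in samples]
```

Running this with, say, 1, 4 and 16 workers against the same aggregator and
recording the stale-host count after each run would show whether failures
grow with the number of pollers. It cannot by itself explain why.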

(Of course, other comments on the root "TN goes up, metrics stop
updating" sporadic problem are also welcome!)

Thank you,
Chris Burroughs


[1] https://github.com/ganglia/monitor-core/issues/47

[2] Failures per day (date as YYMMDD, then failure count):
120827  89
120828  6
120829  3
120830  4
120831  5
120901  1
120902  6
120903  2
120904  9
120905  4
120906  70
120907  523
120908  85
120909  4
120910  6
120911  2
120912  5
120913  5
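
To put a number on the increase, the counts in [2] can be compared against
a baseline. The sketch below assumes the extra gmetad instances ran on
120906 through 120908; the message only gives an approximate date, so that
window is a guess. The median is used for the baseline so that the single
high count on 120827 does not dominate it.

```python
from statistics import median

# Failures per day, copied from footnote [2] (keys are YYMMDD).
FAILURES = {
    "120827": 89, "120828": 6, "120829": 3, "120830": 4, "120831": 5,
    "120901": 1, "120902": 6, "120903": 2, "120904": 9, "120905": 4,
    "120906": 70, "120907": 523, "120908": 85, "120909": 4, "120910": 6,
    "120911": 2, "120912": 5, "120913": 5,
}

# ASSUMPTION: the additional gmetad instances were running on these days.
TEST_DAYS = {"120906", "120907", "120908"}

test_counts = [n for day, n in FAILURES.items() if day in TEST_DAYS]
other_counts = [n for day, n in FAILURES.items() if day not in TEST_DAYS]

baseline = median(other_counts)
print("median failures/day outside the test window:", baseline)
print("failures/day inside the test window:", test_counts)
print("worst test day is %.0f times the baseline" % (max(test_counts) / baseline))
```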
