Re: [Ganglia-general] Java/JMX plugin for Ganglia 3.1.x
All of our metrics of interest are generated by Coda Hale's Metrics package: http://metrics.codahale.com/

It includes a Ganglia reporter that can be used to send metrics to gmond (but not arbitrary pre-existing beans).

On 09/13/2012 08:43 AM, Martin Knoblauch wrote:
> Hi,
>
> as part of a larger Tomcat deployment I need to monitor several Tomcat instances and want to add the measured data to a Ganglia setup. I already found JMXtrans, which seems a cool solution, but it uses host spoofing and I am not sure it is what I really want. It needs some real investigating.
>
> What I would love to have is a gmond plugin that can simply add the measured metrics to the system metrics. Has anybody already written such a plugin, or is anyone working on one? I could provide testing, feedback and maybe help.
>
> Cheers
> Martin

--
Got visibility? Most devs has no idea what their production app looks like. Find out how fast your code is with AppDynamics Lite.
http://ad.doubleclick.net/clk;262219671;13503038;y?
http://info.appdynamics.com/FreeJavaPerformanceDownload.html
___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
Re: [Ganglia-general] Java/JMX plugin for Ganglia 3.1.x
Martin,

If you can upgrade to the latest Ganglia release, you could use sFlow to monitor your Tomcat servers. The jmx-sflow-agent exports standard JVM metrics, and the tomcat-sflow-valve exports the JVM metrics as well as HTTP counters and transactions.

http://host-sflow.sourceforge.net/relatedlinks.php

Cheers,
Peter

On Thu, Sep 13, 2012 at 5:43 AM, Martin Knoblauch kn...@knobisoft.de wrote:
> Hi,
>
> as part of a larger Tomcat deployment I need to monitor several Tomcat instances and want to add the measured data to a Ganglia setup. I already found JMXtrans, which seems a cool solution, but it uses host spoofing and I am not sure it is what I really want. It needs some real investigating.
>
> What I would love to have is a gmond plugin that can simply add the measured metrics to the system metrics. Has anybody already written such a plugin, or is anyone working on one? I could provide testing, feedback and maybe help.
>
> Cheers
> Martin
> --
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www: http://www.knobisoft.de
[Ganglia-general] Impact of gmond polling on data collection
We use Ganglia to monitor 500 hosts in multiple datacenters, with about 90k unique host:metric pairs per DC. We use this data for all of the cool graphs in the web UI and for passive alerting. One of our checks measures the TN of load_one on every box: we want to make sure gmond is working and correctly updating metrics, otherwise we could be blind and not know it. We consider it a failure if TN exceeds 600 seconds. This is an arbitrary number, but 10 minutes seemed plenty long.

Unfortunately we are seeing this check fail far too often. We set up two parallel gmetad instances (monitoring identical gmonds) per DC and have broken our problem into two classes:

* (A) Only one of the gmetads stops updating for an entire cluster and must be restarted to recover. Since the gmetads disagree, we know the problem is there. [1]
* (B) Both gmetads say an individual host has not reported (gmond aggregation or sending must be at fault). This issue is usually transient; that is, it recovers after some period of time greater than 10 minutes.

While attempting to reproduce (A), we ran several additional gmetad instances (again polling the same gmonds) around 2012-09-07. Failures per day are below [2]. The act of testing seems to have significantly increased the number of failures.

This led us to wonder whether the act of polling a gmond aggregator could impair its ability to collect metrics concurrently. We looked at the code but are not experienced with concurrent programming in C. Could someone more familiar with the gmond code comment on whether this is likely to be a worthwhile avenue of investigation? We are also looking for suggestions for an empirical test to rule this out.

(Of course, other comments on the root "TN goes up, metrics stop updating" sporadic problem are also welcome!)
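For readers who want to reproduce the kind of staleness check described above, here is a minimal sketch in Python. It is not the check used in this deployment. It assumes gmond's TCP accept channel is on the default port 8649 and returns the usual XML dump of HOST and METRIC elements with a TN attribute. The function names, the hostname in the usage note, and the choice to treat a host with no load_one metric as stale are all illustrative assumptions.

```python
import socket
import xml.etree.ElementTree as ET

STALE_AFTER_SECONDS = 600  # the 10-minute threshold mentioned in the message


def fetch_gmond_xml(host, port=8649, timeout=10.0):
    """Read the XML dump gmond writes on its TCP accept channel.

    Port 8649 is gmond's default; adjust it if your gmond.conf differs.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def stale_hosts(xml_bytes, metric="load_one", max_tn=STALE_AFTER_SECONDS):
    """Return (host, tn) pairs, sorted by host, whose metric TN exceeds max_tn.

    tn is None when the host does not report the metric at all; counting
    that case as stale is an assumption of this sketch.
    """
    root = ET.fromstring(xml_bytes)
    stale = []
    for host in root.iter("HOST"):
        tn = None
        for m in host.iter("METRIC"):
            if m.get("NAME") == metric:
                tn = int(m.get("TN"))
                break
        if tn is None or tn > max_tn:
            stale.append((host.get("NAME"), tn))
    return sorted(stale, key=lambda pair: pair[0])
```

A call such as `stale_hosts(fetch_gmond_xml("gmond-aggregator.example.com"))` (hypothetical hostname) would list the hosts whose load_one has not updated within the threshold. Note that this check is itself one more poller of the gmond aggregator, which matters if polling turns out to be part of the problem.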
Thank you,
Chris Burroughs

[1] https://github.com/ganglia/monitor-core/issues/47

[2] Failures per day (YYMMDD, count):
120827 89
120828 6
120829 3
120830 4
120831 5
120901 1
120902 6
120903 2
120904 9
120905 4
120906 70
120907 523
120908 85
120909 4
120910 6
120911 2
120912 5
120913 5