Jonathan Pauli wrote:

Is there a way to remove some of the standard metrics
from gmond? We are running ganglia on a 300 node cluster
and are seeing some performance issues, especially with CPU usage on the head node (running gmetad). It would also
be desirable to cut down on the clutter on the web page, and
cut down multicast traffic if possible.

Thank you much.



Check out gmond/metric.h - it has all the metrics and metric
thresholds in there. In most cases the most frequently sent metrics are CPU load and memory usage. Increasing the delays on those metrics will probably help you out. Of course, that does involve rebuilding, redistributing and restarting gmond on every node. Hopefully you have the infrastructure in place...

As for reducing CPU usage on the gmetad node ... whew. May want to check the archives. There are several potential bottlenecks and we've gone over all of them at one point or another. Disk I/O seems to be the first one people hit on Linux boxes. A temporary filesystem or even a journalling filesystem may help with performance.

Another thing to check: is gmetad using a lot of CPU time when *no one* is hitting the web front-end? If so, then XML parsing (doubtful) or RRD updating (much more likely) is where all the CPU-cycle-sucking is happening.
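
To get a feel for why RRD updating is the more likely suspect, here is a rough back-of-envelope sketch in Python. It assumes one RRD file per host per metric and one update per file per polling cycle. The 300 nodes come from the question; the 30 metrics per node and the 15-second polling interval are illustrative guesses, not numbers taken from any real config.

```python
def rrd_updates_per_second(nodes, metrics_per_node, poll_interval_s):
    """Rough RRD update rate.

    Assumes one RRD file per (host, metric) pair and one update per
    file per polling cycle. All inputs are estimates you supply.
    """
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    return nodes * metrics_per_node / poll_interval_s


if __name__ == "__main__":
    # 300 nodes is from the question; 30 metrics and 15 s are assumed.
    print(rrd_updates_per_second(300, 30, 15))
    # Dropping ten metrics per node cuts the rate proportionally.
    print(rrd_updates_per_second(300, 20, 15))
```

Under those assumptions every metric removed per node saves the same number of disk writes per cycle, which is why trimming the metric list helps the gmetad box as well as the network.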

If you're getting those fabulous 8-to-10-second page loads like I am, you just have to wait for a gmetad and web front-end that support interactive queries, unfortunately (reducing the number of metrics transmitted would actually address this, too). Parsing 3MB (or more) of XML *in PHP* on *every page load* isn't fun...
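
If you want to see how much XML the front-end is chewing through, something like the following Python sketch can summarize a dump you have already captured. It assumes the dump uses HOST and METRIC elements; the function name and the tiny sample document are made up for illustration, and how you capture the dump from gmetad is left to you.

```python
import xml.etree.ElementTree as ET


def summarize_ganglia_xml(data):
    """Return (size_in_bytes, host_count, metric_count) for an XML dump.

    `data` is the raw dump as bytes. The HOST and METRIC element names
    are assumed; adjust them if your dump looks different.
    """
    root = ET.fromstring(data)
    hosts = sum(1 for _ in root.iter("HOST"))
    metrics = sum(1 for _ in root.iter("METRIC"))
    return len(data), hosts, metrics


# Hypothetical miniature dump, only here so the sketch runs on its own.
SAMPLE = b"""<GANGLIA_XML VERSION="x" SOURCE="gmond">
<CLUSTER NAME="demo">
<HOST NAME="node1"><METRIC NAME="load_one" VAL="0.1"/><METRIC NAME="mem_free" VAL="1"/></HOST>
<HOST NAME="node2"><METRIC NAME="load_one" VAL="0.2"/></HOST>
</CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    size, hosts, metrics = summarize_ganglia_xml(SAMPLE)
    print(size, hosts, metrics)
```

Running it before and after trimming metrics gives a quick read on how much less XML each page load has to parse.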

You can also reduce the number of graphs shown per page. We just went over that last month, I believe...

Hope this helps.

