Hi,

I have a rather large and complicated Ganglia deployment, which can
almost certainly be simplified; I just don't know how.

Anyway, the issue I am trying to resolve is this: a gmetad with dozens
of gmond datasources is writing summary data for some of these
datasources but not the host metrics, which all show NaN in
the RRDs. Another gmetad with the same datasource IS able to write the
host metrics, so I don't know why this one won't. The datasource
configuration is identical (literally copied and pasted from one config
file to generate the new one). Running the affected gmetad with -d 3, I
cannot see any "Updating host" lines for the affected hosts, but I can
see it writing the summary data.
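
One way to narrow this down is to compare which hosts each gmetad
actually reports. The sketch below is only an illustration: it assumes
both gmetad instances expose their XML dump on the default xml_port
(8651), and the host names in the usage comment are hypothetical
placeholders. It lists hosts that the working gmetad reports but the
affected one does not.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port, timeout=10.0):
    """Read the XML dump that gmond or gmetad sends to a TCP client on connect."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the daemon closes the socket after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER name to the set of HOST names listed under it."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for cluster in root.iter("CLUSTER"):
        hosts = {h.get("NAME") for h in cluster.findall("HOST")}
        result.setdefault(cluster.get("NAME"), set()).update(hosts)
    return result


def missing_hosts(reference, suspect):
    """Hosts present in the reference dump but absent from the suspect one."""
    diff = {}
    for cluster, hosts in reference.items():
        gone = hosts - suspect.get(cluster, set())
        if gone:
            diff[cluster] = sorted(gone)
    return diff


# Usage (hypothetical host names, default gmetad xml_port assumed):
# good = hosts_by_cluster(fetch_xml("working-gmetad.example.com", 8651))
# bad = hosts_by_cluster(fetch_xml("affected-gmetad.example.com", 8651))
# print(missing_hosts(good, bad))
```

If the affected gmetad lists a cluster with no HOST entries, the
problem is likely upstream of the RRD writes; if the hosts are listed
but the RRDs still show NaN, the write path is the more likely suspect.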

The particular part of the deployment we are having an issue with is
this: We have two Ganglia collectors, running dozens of gmond instances
to collect data from different logical groups of servers (web server
group, app server group, cache server group, etc.). These are then
aggregated into their logical systems (external website, internal
website, database servers, etc.) by a number of gmetad instances, and
those gmetad instances are then aggregated by another gmetad (so we
have a grid of grids, sometimes nested 3 deep).
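
In gmetad.conf, each gmond group in a layout like this appears as a
data_source line. As a rough sketch, the lines for every gmetad could
be rendered from one shared inventory so that copies cannot drift
apart. The group names, collector host names, and ports below are
hypothetical, and render_data_sources is an illustrative helper, not
part of Ganglia.

```python
def render_data_sources(groups, interval=None):
    """Render gmetad.conf data_source lines from a {name: [(host, port), ...]} map."""
    lines = []
    for name in sorted(groups):
        endpoints = " ".join(f"{host}:{port}" for host, port in groups[name])
        # The polling interval is optional and sits between the name and the hosts.
        step = f" {interval}" if interval is not None else ""
        lines.append(f'data_source "{name}"{step} {endpoints}')
    return "\n".join(lines)


# Hypothetical inventory: one gmond per logical group on each collector.
inventory = {
    "web servers": [("collector1.example.com", 8650), ("collector2.example.com", 8650)],
    "app servers": [("collector1.example.com", 8652), ("collector2.example.com", 8652)],
}

print(render_data_sources(inventory))
```

Listing both collectors on one data_source line gives gmetad a
fallback address for the same group if the first one is unreachable.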

We have one other gmetad instance running on a nagios machine with ALL
gmond instances as sources, to aggregate everything, so we can use the
nagios integration scripts.

Software versions:

clients are running a mix of 3.1.2 and 3.1.7
collector machines are running 3.1.7
nagios machine is running 3.4.0 (built from github to include the grid
of grids fix)

Finally, if someone has experience with deployments like this, and can
offer some advice on how to simplify down from dozens of gmond and
gmetad configs, that would be much appreciated. This has grown
organically over the years, and is simply getting out of hand.

Kindest regards,
Michael
