Hi,

I've been hacking on the Ganglia gmond code to get the agent to
auto-discover the other servers in its cluster when running in EC2 [1]. It
works much like the EC2 discovery in elasticsearch [2].

To enable it, add the following stanzas to gmond.conf:

/* Dynamic discovery for cloud environments */
cloud {
  aws_access_key = INSERT_YOUR_ACCESS_KEY
  aws_secret_key = INSERT_YOUR_SECRET_KEY
}

discovery {
  type = ec2 /* only ec2 API supported so far */
# endpoint = https://ec2.amazonaws.com /* only required if not in us-east-1 */
  tags = { stage:dev } /* stage:prod */
  groups = { quicklaunch-1 } /* security groups */
  availability_zones = { us-east-1d } /* eg. eu-west-1a */
  discover_every = 90
  host_type = public_dns /* private_ip, public_ip, private_dns, public_dns */
  port = 8649
}

At start-up, gmond combines the tags, groups and availability zones that
you define in the discovery section into a single filter and uses the EC2
API to find the list of matching EC2 instances.
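
As a rough illustration of how the settings combine (a Python sketch only,
not the actual C code in gmond), the filter could be assembled along these
lines. The function name build_filters is made up for this example. The
filter names tag:<key>, group-name and availability-zone are standard EC2
DescribeInstances filters, and the Name/Values layout is the one current
AWS SDKs accept; whether gmond builds its request exactly this way is an
assumption.

```python
def build_filters(tags, groups, availability_zones):
    """Combine discovery settings into EC2 DescribeInstances filters.

    EC2 ANDs separate filters together and ORs the values inside
    a single filter, so an instance must match every setting.
    """
    filters = []
    # One filter per tag, e.g. stage:dev becomes "tag:stage" = ["dev"].
    for key, value in sorted(tags.items()):
        filters.append({"Name": "tag:" + key, "Values": [value]})
    # Security groups, matched by name (hypothetical choice for this sketch).
    if groups:
        filters.append({"Name": "group-name", "Values": list(groups)})
    if availability_zones:
        filters.append(
            {"Name": "availability-zone", "Values": list(availability_zones)}
        )
    return filters


# Values taken from the example discovery section above.
filters = build_filters({"stage": "dev"}, ["quicklaunch-1"], ["us-east-1d"])
```

With the example config this yields three filters, so only instances that
are tagged stage:dev, are in the quicklaunch-1 group and run in us-east-1d
would be returned.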

Whenever a new instance comes up (as part of a scaling group, say) and
sends metrics to existing instances, it triggers those gmonds to run
another discovery, which should find the new server.

It will also do a rediscovery every so often (by default every 90 seconds)
so that instances that have been terminated are removed from its list of
UDP send destinations.
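
The bookkeeping on each rediscovery amounts to a set difference between
the current send destinations and the freshly discovered hosts. A minimal
Python sketch of that idea follows; the function name reconcile and the
host names are hypothetical, and the real gmond code is C and may
structure this differently.

```python
def reconcile(current, discovered):
    """Return (to_add, to_remove) for the UDP send destination list.

    current    -- hosts gmond is sending to right now
    discovered -- hosts returned by the latest EC2 discovery
    """
    current, discovered = set(current), set(discovered)
    to_add = sorted(discovered - current)      # newly launched instances
    to_remove = sorted(current - discovered)   # terminated instances
    return to_add, to_remove


# Hypothetical host names, for illustration only.
to_add, to_remove = reconcile(
    current=["host-a.example", "host-b.example"],
    discovered=["host-b.example", "host-c.example"],
)
```

Here host-c.example would be added as a destination and host-a.example
dropped, while host-b.example is left untouched.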

This all works really well so far. The only thing I can't work out is how
to support gmetric. If I understand gmetric correctly, it works out what
the UDP send destinations should be by reading the gmond.conf file.
However, if gmond is using EC2 discovery, there are no static destinations
listed. One solution might be for gmetric to query the EC2 API for the list
the same way gmond does, but this would add quite an overhead to a
lightweight CLI.

Also, we use gmetric quite a lot (it is called thousands of times a minute)
on some servers, and that would not scale if each gmetric exec had to query
the EC2 API first.

Does anyone have any suggestions on how I might get gmetric to work in a
scalable way if it can't rely on the UDP send destinations being listed in
the gmond.conf file? It really is a show-stopper for us at the moment which
is unfortunate because gmond would work brilliantly in EC2 with these
changes.

Thanks in advance,
Nick

[1] https://github.com/satterly/monitor-core
[2] http://www.elasticsearch.org/guide/reference/modules/discovery/ec2.html
 and
http://www.elasticsearch.org/tutorials/2011/08/22/elasticsearch-on-ec2.html