Alex Balk wrote:

Ole,


That's an interesting setup you have there. Sounds quite similar to what
I've put together, except for the DB backend. I avoided that as I didn't
want System personnel to have to deal with a DB. I especially like your
idea on generating gmond.conf files... I assume you account for "DB
offline" cases through high availability features in Oracle?

If the DB is offline, a configuration won't be generated and the node config simply won't be updated. I've not actually seen this happen yet (in ~2 years, though I do have 24x7 DBA support). The scheme also helps when going from 2.6 to 3.0.X, since the script will be updated for these nodes. We have not deployed 3.0.X on production hosts yet, though, and I will have to make some changes (to the load metric, for example) before that deployment.

Regarding your daily update mechanism, I've found that a timeout won't
do. Using a timeout to prevent DoS simply doesn't scale: once you have
20k nodes (or some other high number) you might find that you're killing
the server by sheer query mass. To avoid this problem I've spread the
update interval over a random time period on each node. On top of that,
using a map file to determine "who's serving who" in the Ganglia setup
allowed me to distribute load among several machines. All in all, these
two methods allow me to use hourly updates, with the load on the top
gmetads (which also serve updates to map files, gmetrics, etc.) remaining
under 1.0.
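
A minimal sketch of the node-side spreading, assuming the update runs
from cron and using a placeholder window and updater script rather than
the actual code:

#!/usr/bin/perl
# Sketch: spread config-update requests over a random window so a large
# node count doesn't hit the config server at the same instant.
# The window length and the updater script path are assumptions.
use strict;
use warnings;

my $window = 3600;                  # one-hour update window (assumed)
sleep(int(rand($window)));          # each node waits a different amount

# then refresh the local gmond.conf (hypothetical helper script)
system('/usr/local/sbin/update_gmond_conf.pl') == 0
    or warn "gmond.conf refresh failed: $?\n";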


You say that your grid includes about 10k nodes. I've encountered an
issue with grid-wide memory metrics overflowing a uint32, since they're
reported in KB. As a result, when the total amount of memory in the
grid reached 4TB, the uint32 storing total grid memory would roll over
to zero. To work around this issue I've had to change the gmond metric
code to report memory in MB. Not the most elegant solution, as in a few
years we'll be pushing that limit as well, but implementing uint64
support in Ganglia seemed just too complicated at the time.
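
For the record, the arithmetic behind both limits (illustrative only):

#!/usr/bin/perl
# Where a uint32 total wraps, depending on the unit it counts.
use strict;
use warnings;

printf "uint32 counting KB wraps at %d TB\n", 2**32 / 1024**3;  # 4 TB (1 TB = 1024**3 KB)
printf "uint32 counting MB wraps at %d PB\n", 2**32 / 1024**3;  # 4 PB (1 PB = 1024**3 MB)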

Have you encountered this problem, and if so, how did you solve it?


I've experienced the same thing, though I don't rely too much on the grid-wide memory metrics (that metric is more interesting for host analysis); I rely more on the cpu and load metrics. Besides the cpu metrics on the graphical front end, I have some reporting scripts which work on the RRDs at the back end and iterate over the clusters if the limit is reached (another script which should be shared with everyone...).

A uint64 implementation is definitely desirable, both for me and for future-proofing Ganglia, I think.

Sorry if my original explanation wasn't clear enough: the total deployment is ~10K, but it is spread globally. For my update script (a pull from the node) the implementation is a random timeout on the node (which doesn't exceed an hour), and Ganglia ( :-) ) shows the configuration web server handling this load comfortably. I've got 6 regional gmetads, and the time zone with the largest deployment has ~4K hosts.
Cheers,

Ole

Cheers,

Alex


Ole Turvoll wrote:

Chris, all,

I'm in the same legal position as Alex (in addition, I'm not allowed to
use my work email address and have to rely on my ISP email service,
which is only up intermittently - and webmail is blocked while I'm at work).

However I'd like to share my experience.

Our hierarchy (by geography) is as follows:

Global gmetad
   |
   |
Regional gmetad (collecting every 60 seconds)
   |
   |
send_receive gmonds (from 10 ~ 400 nodes)
   |
   |
nodes (~10k) sending udp unicast
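
For readers less familiar with the unicast layout, a rough gmond.conf
sketch (3.0.x-style syntax) of the bottom two layers; hostnames and
ports are placeholders, not the actual configuration:

/* leaf node: unicast its metrics to the cluster's send_receive gmond */
udp_send_channel {
  host = gmond-agg.region.example.com
  port = 8649
}

/* send_receive gmond: receive from the leaf nodes and answer the
   regional gmetad's TCP polls */
udp_recv_channel {
  port = 8649
}
tcp_accept_channel {
  port = 8649
}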


I agree with Alex's notions, though we do not use the gmetrics
functionality, for various reasons - mainly the impact of loading
scripts and the manageability of that solution.

What we've implemented is a global, web-served gmond configuration
engine. The architecture is an Oracle database with a web front end
which controls our Ganglia architecture.

A walk through of functionality:

On the nodes (gmond package):
With each gmond package I include a perl script which sends variables
taken from the local host (fqdn, interface) via HTTP to the web server.

On the server (gmetad package):
The PHP cgi script sitting on the web server returns to the node a
gmond.conf in which the specifics of which gmonds it will report to are
included. Depending on the fqdn the PHP cgi script receives, it will
either (1) enter the host into a default gmond (updating the database)
or (2) send back its configuration file with the predesignated gmonds
(taken from the database) it will report to.

Finally, on the node end, it starts gmond (the last phase of a package
install) with the newly acquired gmond.conf. A rough sketch of the
node-side step follows.
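
A minimal sketch of what that node-side script could look like, assuming
LWP and Sys::Hostname are available; the server URL, parameter names and
paths are placeholders, not the actual implementation:

#!/usr/bin/perl
# Sketch of the node-side step: report fqdn/interface to the config web
# server and install whatever gmond.conf it hands back.
use strict;
use warnings;
use Sys::Hostname;
use LWP::UserAgent;

my $fqdn      = hostname();   # a real script would ensure this is the FQDN
my $interface = 'eth0';       # and detect the reporting interface properly
my $server    = 'http://ganglia-config.example.com/gmond_conf.php';

my $ua  = LWP::UserAgent->new(timeout => 30);
my $res = $ua->post($server, { fqdn => $fqdn, interface => $interface });

if ($res->is_success) {
    open my $fh, '>', '/etc/gmond.conf' or die "cannot write gmond.conf: $!";
    print {$fh} $res->content;
    close $fh;
    system('/etc/init.d/gmond', 'restart');   # last phase of the package install
} else {
    warn "config server unreachable; keeping the existing gmond.conf\n";
}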

Some other points about the architecture:
- It uses the TemplatePower engine.
- A cronjob checks that the gmond.conf is up to date every day at 12
pm local time (there is no DoS risk since we've included a timeout on
the node side).
- The database tables are very simple.
- Anyone can bulk update the database through a simple DBD perl script
(a rough sketch follows below).
- A web front end for the database table allows us to easily view the
send_receive gmonds and the send gmonds, which enables us to
understand and manage our environment with very low overhead.
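
As a rough illustration of such a bulk update (the table and column
names here are invented; the real schema isn't described in this thread):

#!/usr/bin/perl
# Sketch: bulk update of the config database from "fqdn aggregator"
# pairs on stdin. Table/column names and credentials are assumptions.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Oracle:GANGLIA', 'ganglia', 'secret',
                       { RaiseError => 1, AutoCommit => 0 });

my $sth = $dbh->prepare(
    'UPDATE node_config SET gmond_aggregator = ? WHERE fqdn = ?');

while (my $line = <STDIN>) {
    chomp $line;
    next unless $line =~ /\S/;
    my ($fqdn, $aggregator) = split ' ', $line;
    $sth->execute($aggregator, $fqdn);
}

$dbh->commit;
$dbh->disconnect;
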
That's all I can think of for now....  Any questions/queries are welcome.

Unfortunately I'm in the same position as Alex: I'd like to share
this, but I'm not sure whether I can at this time.

Thanks,

Ole


Chris Croswhite wrote:

Alex,

Thanks for the great information.  I'll check out the Jan email thread
and then follow up with more questions.

BTW, the script statement did come across rather badly, sorry about that
(and after all that PC training I was required to take!)

Thanks
Chris

On Thu, 2006-02-23 at 10:36, Alex Balk wrote:


Chris Croswhite wrote:

Alex,

Yeah, I already have a ton of questions and need some pointers on
large-scale deployments (best practices, do's, don'ts, etc.).


Till I get the legal issues out of the way, I can't share the
scripts...
What I can do, however, is share the ideas I've implemented, as those
were developed outside the customer environment and were just spin-offs
of common concepts like orchestration, federation, etc.
Here are a few things:

  * When unicasting, a tree hierarchy of nodes can provide useful
    drill-down capabilities.
  * Most organizations already have some form of logical grouping for
    cluster nodes. For example: faculty, course, devel-group, etc.
    Within those groups one might find additional logical
    partitioning. For example: platform, project, developer, etc.
    Using these as the basis for constructing your logical hierarchy
    provides a real-world basis for information analysis, saves you
    the trouble of deciding how to construct your grid tree, and
    prevents gmond aggregators from handling too many nodes (though
    I've found that a single gmond can store information for 1k nodes
    without noticeable impact on performance).
  * Nodes will sometimes move between logical clusters. Hence,
    whatever mechanism you have in place has to detect this and
    regenerate its gmond.conf.
  * Using a central map which stores "cluster_name gmond_aggregator
    gmetad_aggregator" will save you the headache of figuring out who
    reports where, who pulls info from where, etc. (see the sketch
    after this list). If you take this approach, be sure to cache the
    file locally or put it on your parallel FS (if you use one). You
    wouldn't want 10k hosts trying to retrieve it from a single filer.
  * The same map file approach can be used for gmetrics. This allows
    anyone in your IT group to add custom metrics without having to be
    familiar with gmetric and without having to handle crontabs. A
    mechanism which reads (the cached version of) this file could
    handle inserting/removing crontabs as needed.
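
A minimal sketch of reading such a map on a node, assuming it has
already been cached locally as suggested above; the path is a
placeholder and the field order simply follows the "cluster_name
gmond_aggregator gmetad_aggregator" layout:

#!/usr/bin/perl
# Sketch: look up which aggregators a cluster reports to from a locally
# cached map file. Path and usage are assumptions.
use strict;
use warnings;

my $map_file = '/var/cache/ganglia/cluster.map';   # assumed cache location
my %map;

open my $fh, '<', $map_file or die "cannot read $map_file: $!";
while (<$fh>) {
    next if /^\s*(?:#|$)/;                          # skip comments and blanks
    my ($cluster, $gmond, $gmetad) = split;
    $map{$cluster} = { gmond => $gmond, gmetad => $gmetad };
}
close $fh;

my $cluster = shift @ARGV or die "usage: $0 <cluster_name>\n";
die "unknown cluster: $cluster\n" unless $map{$cluster};
print "$cluster reports to gmond $map{$cluster}{gmond}, ",
      "polled by gmetad $map{$cluster}{gmetad}\n";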

Also, check out the ganglia-general thread from January 2006 called
"Pointers on architecting a large scale ganglia setup".


I would love to get my hands on your shell scripts to figure out what
you are doing (the unicast idea is pretty good).


Okay, that sounds almost obscene ;-)


Cheers,
Alex

Chris


On Thu, 2006-02-23 at 09:35, Alex Balk wrote:

Chris,


Cool! Thanks!

If you need any pointers on large-scale deployments, beyond the
excellent thread that was discussed here last month, drop us a
line. I'm
managing Ganglia on a cluster of about the same size as yours,
spanning
multiple sites.


I've developed a framework for automating the deployment of
Ganglia in a
federated mode (we use unicast). I'm currently negotiating the
possibility of releasing this framework to the Ganglia community.
It's
not the prettiest piece of code, as it's written in bash and spans
a few
thousand lines of code (I didn't expect it to grow into something
like
that), but it provides some nice functionality like map-based logical
clusters, automatic node migration between clusters, map-based
gmetrics,
and some other goodies.

If negotiations fail I'll consider rewriting it from scratch in
perl in
my own free time.


btw, I think Martin was looking for a build on HP-UX 11...


Cheers,

Alex


Chris Croswhite wrote:

This raises another issue, which I believe is significant to the
development process of Ganglia. At the moment we don't seem to have
(correct me if I'm wrong) official testers for various platforms.
Maybe we could have some people volunteer to be official beta
testers?
That way we wouldn't have a release go out the door without it being
properly tested under most OS/archs.
The company I work for is looking to deploy ganglia across all
compute
farms, some ~10k systems.  I could help with beta testing on these
platforms:
HP-UX 11+11i
AIX51+53
slowlaris7-10
solaris10 x64
linux32/64 (SuSE and RH)

Just let me know when you have a new candidate and I can push the
client
onto some test systems.

Chris



