Re: [Ganglia-general] Setup large clusters

2008-03-12 Thread Jesse Becker
On Wed, Mar 12, 2008 at 11:35 AM, Martin Hicks <[EMAIL PROTECTED]> wrote:
>
>  Hi,
>
>  I'm wondering what the suggested setup is for a large Grid.  I'm having
>  trouble scaling ganglia to work on large clusters.

Have you considered turning off disk read-ahead on the partition that
includes the .rrd files?  That should help somewhat.  There's a good
paper that discusses the performance of MRTG (a heavy user of
rrdtool), and how to make it scale nicely:
http://www.usenix.org/event/lisa07/tech/plonka.html

Note that the patch to rrdtool mentioned in the paper is included
in rrdtool 1.2.24 and later.
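
For example, something along these lines; /dev/sda here is just a
placeholder for whichever disk holds your rrd partition:

# show the current read-ahead setting, in 512-byte sectors
blockdev --getra /dev/sda

# turn read-ahead off for that device
blockdev --setra 0 /dev/sda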




-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-general] Setup large clusters

2008-03-12 Thread Martin Hicks

On Wed, Mar 12, 2008 at 10:52:03AM -0500, Seth Graham wrote:
> Martin Hicks wrote:
> 
> >The configuration of gmetad has been modified to store the rrds in
> >/dev/shm, but this directory gets very large so I'd like to move away
> >from that.
> 
> Using tmpfs is pretty much your only option. As you discovered, the disk 
> I/O will bring most machines to their knees.

:( Seems like a pretty crappy use of that much memory.

cct506-1:~ # du -s --si /dev/shm/rrds/
477M    /dev/shm/rrds/

I've seen the rrds directory at 1.5GB in production clusters.
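
If tmpfs really is the only workable option, I'd at least want to give it a
dedicated, size-capped mount rather than dumping into /dev/shm.  Roughly
something like this, I suppose (the path and the 2g cap are just guesses):

# /etc/fstab: a dedicated tmpfs for the rrds, capped at 2GB
# (make sure whatever user gmetad runs as can write here)
tmpfs  /var/lib/ganglia/rrds  tmpfs  size=2g,mode=0755  0 0

# and point gmetad at it in gmetad.conf
rrd_rootdir "/var/lib/ganglia/rrds"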

> 
> >Is there a way that I should be architecting the configuration files
> >to make ganglia scale to work on this cluster?
> >
> >I think I want to run gmetad on each head node, and to use that RRD data 
> >without
> >regenerating it on the admin node.  Is that possible?
> 
> This is definitely possible, though I don't think it's necessary. I have 
>  machines handling 1500 reporting nodes without problems, writing the 
> rrds to a tmpfs.
> 
> The downside of setting up ganglia with head nodes is that you have to 
> set up some way to make the rrds available to a central web server. 
> Several ways to do that too, but they introduce their own headaches.

Right.  So I'd have to use NFS or something similar.

After I wrote this first e-mail I started wondering about the updates to
__SummaryInfo__.  How awful/expensive would that be if all of the
sub-cluster RRD files were NFS-mounted?
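
To make that concrete, what I'm picturing on the admin node is something like
the following fstab entries (host and path names are invented), mounted rw on
the assumption that gmetad would still write the __SummaryInfo__ rrds in there:

# /etc/fstab on the admin node: one mount per sub-cluster head node
head01:/var/lib/ganglia/rrds/cluster01  /var/lib/ganglia/rrds/cluster01  nfs  rw,soft,intr  0 0
head02:/var/lib/ganglia/rrds/cluster02  /var/lib/ganglia/rrds/cluster02  nfs  rw,soft,intr  0 0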

Does that summary info get regenerated on the same poll interval as
everything else?

Thanks,
mh




Re: [Ganglia-general] Setup large clusters

2008-03-12 Thread Seth Graham
Martin Hicks wrote:

> The configuration of gmetad has been modified to store the rrds in
> /dev/shm, but this directory gets very large so I'd like to move away
> from that.

Using tmpfs is pretty much your only option. As you discovered, the disk 
I/O will bring most machines to their knees.

> Is there a way that I should be architecting the configuration files
> to make ganglia scale to work on this cluster?
> 
> I think I want to run gmetad on each head node, and to use that RRD data 
> without
> regenerating it on the admin node.  Is that possible?


This is definitely possible, though I don't think it's necessary. I have 
  machines handling 1500 reporting nodes without problems, writing the 
rrds to a tmpfs.

The downside of setting up ganglia with head nodes is that you have to 
set up some way to make the rrds available to a central web server. 
Several ways to do that too, but they introduce their own headaches.
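
As a rough sketch of one of those ways (paths and the web server address are
placeholders), each head node could simply export its rrd tree to the central
web server over NFS:

# /etc/exports on a head node
# (if the rrds live on tmpfs you may also need an explicit fsid= option here)
/var/lib/ganglia/rrds  webserver.example.com(ro,no_subtree_check)

# reload the export table
exportfs -ra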





[Ganglia-general] Setup large clusters

2008-03-12 Thread Martin Hicks

Hi,

I'm wondering what the suggested setup is for a large Grid.  I'm having
trouble scaling ganglia to work on large clusters.

Consider the following:

- Pretty well default gmond.conf distributed throughout all cluster
  members.

- 20 clusters of 64 nodes.  gmond running on each cluster node, plus on
  the cluster head node.

- gmond and gmetad running on the "admin" node, which has the Grid
  defined in gmetad, and polls the information from each of the cluster
  head nodes.

The configuration of gmetad has been modified to store the rrds in
/dev/shm, but this directory gets very large so I'd like to move away
from that.
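
Schematically, the admin node's gmetad.conf boils down to one data_source
line per head node plus the rrd directory; the cluster and host names below
are made up:

# gmetad.conf on the admin node (excerpt)
data_source "cluster01" cluster01-head:8649
data_source "cluster02" cluster02-head:8649
# ... one line per cluster, 20 in all ...
data_source "cluster20" cluster20-head:8649
rrd_rootdir "/dev/shm/rrds"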

Switching the rrd directory back to the on-disk default doesn't work.  As
soon as gmetad gets through its first round of grabbing metrics from the
head nodes, the machine starts writing a constant stream of small updates
to disk, which consumes it completely:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff   cache   si   so    bi    bo    in    cs us sy  id wa st
 3  0  17140 641292 453944 2009608    0    0     0     0  5092 20394 19 27  55  0  0
 3  0  17140 632364 453952 2009600    0    0     0     0  6358 17128 24 32  44  0  0
 1  0  17140 631744 453952 2009600    0    0     0     0  2579  7545  7 11  82  0  0
 0  0  17140 629264 453952 2009600    0    0     0     0  2099 11337  7  7  86  0  0
 0  0  17140 629264 453952 2009600    0    0     0     0   351   855  0  0 100  0  0
 0  1  17140 629264 453952 2009600    0    0     0  3456   986   793  0  0  59 41  0
 0  1  17140 629264 453952 2009600    0    0     0  3332  1159   897  0  0  50 50  0
 0  1  17140 629280 453952 2009600    0    0     0  3072  1019   814  0  0  50 50  0
 0  1  17140 629280 453952 2009600    0    0     0  1792   771   886  0  0  50 50  0
 0  1  17140 629280 453952 2009600    0    0     0  1284   588   761  0  0  50 50  0
 0  2  17140 629280 453952 2009600    0    0     0  1536   676   890  0  0  38 61  0
 0  2  17140 629296 453952 2009600    0    0     0  1280   613   763  0  0  50 50  0
 0  2  17140 629296 453952 2009600    0    0     0  2048   825   887  0  0  50 50  0


forever more...

Is there a way that I should be architecting the configuration files
to make ganglia scale to work on this cluster?

I think I want to run gmetad on each head node, and to use that RRD data without
regenerating it on the admin node.  Is that possible?
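
In other words, each head node would run its own gmetad that only watches its
local cluster, roughly like this (names invented):

# gmetad.conf on a cluster head node
data_source "cluster01" localhost:8649
rrd_rootdir "/var/lib/ganglia/rrds"

and the admin node would then reuse those rrds (e.g. over NFS) instead of
polling the head nodes and rewriting everything itself.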

Thanks,
mh

