I personally would leave the collection period at the default, e.g. 15 seconds. You don't really gain much by setting the collection period to 45 seconds, and it may cause other problems, since some metrics may have a short lifetime.

Vladimir

On Fri, 18 Mar 2011, Ron Cavallo wrote:

Jesse,

I did follow both your suggestion and Bernard's, and the problem has gone away. I 
adjusted the "data source" collection interval to 45 seconds.

Also, I have a python script I wrote with pexpect that will ssh into all of 
your servers and stop, start, or restart gmond as directed if anyone wants it. 
Makes restarting your site easier.

Thanks for the help.

-Regards

Ron Cavallo
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-940-5079 (fax)
646-315-0119(C)
www.saks.com


-----Original Message-----
From: Jesse Becker [mailto:haw...@gmail.com]
Sent: Wednesday, March 16, 2011 10:43 PM
To: Ron Cavallo
Cc: ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Web Frontend says that nodes are coming up and 
down , but they are not

Before I answer, I'll mention that Bernard's advice is good, and you
should follow it. :-)

The reason that the stop-gmond/bounce-gmetad/start-gmond process works
has to do with how gmond stores and shares data.  In most cases, gmond
will store data for every single *other* gmond that it hears about. In
the case of multicast, this can be a lot of hosts.  Gmetad keeps
some state as well.  Shutting down the various parts will clear
everything out so you can start fresh.  This is really more useful
when you have gmond clients that you have decommissioned, but can't
remove from Ganglia.

Strictly speaking, you should be able to do:
1) stop gmond everywhere
2) stop gmetad
3) start gmond everywhere
4) start gmetad

But I find it easier to issue 3 commands instead of 4.  The web UI
also is offline when gmetad is not running, and minimizing that
"downtime" may be important in some circumstances.
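
For anyone who wants to script this, here is a minimal sketch of the two orderings described above (the four-step version and the three-command "bounce gmetad" version). It is only an illustration under some assumptions: the host names are made up, the `service gmond ...` and `service gmetad ...` commands stand in for whatever init scripts your distribution uses, and it assumes key-based ssh so that no password prompt appears. It defaults to a dry run that only prints the commands.

```python
import subprocess


def restart_plan(gmond_hosts, gmetad_host, bounce_gmetad=True):
    """Return an ordered list of (host, remote_command) pairs.

    The service commands are assumptions; adjust them to your init system.
    """
    # Step 1: stop gmond everywhere so every node's stored state is cleared.
    plan = [(host, "service gmond stop") for host in gmond_hosts]
    if bounce_gmetad:
        # Three-command variant: restart gmetad while all gmonds are down.
        plan.append((gmetad_host, "service gmetad restart"))
        plan += [(host, "service gmond start") for host in gmond_hosts]
    else:
        # Four-step variant: stop gmetad, start gmond, then start gmetad.
        plan.append((gmetad_host, "service gmetad stop"))
        plan += [(host, "service gmond start") for host in gmond_hosts]
        plan.append((gmetad_host, "service gmetad start"))
    return plan


def run_plan(plan, dry_run=True):
    """Print each ssh command, or execute it when dry_run is False."""
    for host, command in plan:
        argv = ["ssh", host, command]
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True aborts the sequence if any remote command fails.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    run_plan(restart_plan(["node01", "node02"], "monitor01"))
```

Passing `bounce_gmetad=False` gives the four-step sequence; in both cases every gmond is stopped before gmetad is touched, which is the part that matters.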

On Wed, Mar 16, 2011 at 21:32, Ron Cavallo <ron_cava...@s5a.com> wrote:
I would gladly do that in that fashion, can you explain why this corrects
the problem?

Ron Cavallo
Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-451-3510 (fax)
646-315-0119(C)
www.saks.com <http://www.saks.com/>

----- Original Message -----
From: Jesse Becker <haw...@gmail.com>
To: Ron Cavallo
Cc: ganglia-general@lists.sourceforge.net
<ganglia-general@lists.sourceforge.net>
Sent: Wed Mar 16 19:53:29 2011
Subject: Re: [Ganglia-general] Web Frontend says that nodes are coming up
and down , but they are not

I've seen this occasionally.  The usual (and perhaps only) solution is
to shut down *all* of the gmond processes running on your nodes.
Bounce gmetad, then start gmond everywhere.

On Wed, Mar 16, 2011 at 17:06, Ron Cavallo <ron_cava...@s5a.com> wrote:
Every minute nodes disappear from the web frontend and the web frontend
reports them as down. Then they get reported up a minute or so later, then
repeat. Any ideas what is going on?

This is what I see every couple of minutes:

Ron Cavallo

Sr. Director, Infrastructure
Saks Fifth Avenue / Saks Direct
12 East 49th Street
New York, NY 10017
212-451-3807 (O)
212-940-5079 (fax)

646-315-0119(C)

www.saks.com




------------------------------------------------------------------------------
Colocation vs. Managed Hosting
A question and answer guide to determining the best fit
for your organization - today and in the future.
http://p.sf.net/sfu/internap-sfd2d
_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general





--
Jesse Becker


