I have no answers, only vaguely-informed statements and half-formed questions (welcome to free software's version of tech support!).

It is interesting to note that 4 hours = 16 real data points (at 15-minute polling intervals). That's a suspiciously round number...
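For what it's worth, the arithmetic behind that observation can be written out as a small sketch. The function name is just mine for illustration; the 15-minute interval and the 4-hour gap are the figures from the original message.

```python
def samples_in_window(window_hours, poll_interval_minutes):
    """Number of real (non-interpolated) polls that fit in a time window."""
    return (window_hours * 60) // poll_interval_minutes


# Figures from the report: a 4-hour gap between nodes, 15-minute polling.
print(samples_in_window(4, 15))
```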

However, if this were just a matter of the graphs not displaying because there weren't enough interpolated data points to render a nice graph, I would have expected them all to pop up at the same time...

At this point, I would want to compare gmetad's raw XML output to the output of the monitoring core it's polling. The polled monitoring core should have a full metric payload. If it doesn't, then you've got some kind of multicast problem involving that node. Comparing its output to that of the other nodes in the cluster may help narrow it down. Otherwise, crank up the debug level and fire up your favorite packet sniffer...
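Here is a rough sketch of that comparison in Python, in case it saves some squinting at raw XML. It assumes both daemons dump their XML to any client that connects over TCP, and that the ports are the commonly used 8649 for the monitoring core and 8651 for gmetad; those port numbers and the hostnames in the usage comment are assumptions, so check your own gmond.conf and gmetad.conf. The helper names are mine, not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port, timeout=10.0):
    """Connect, read until the daemon closes the socket, return raw bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def host_names(xml_bytes):
    """Return the set of NAME attributes found on <HOST> elements."""
    root = ET.fromstring(xml_bytes)
    return {host.get("NAME") for host in root.iter("HOST")}


def compare_hosts(core_xml, gmetad_xml):
    """Return (hosts only in the monitoring core, hosts only in gmetad)."""
    core = host_names(core_xml)
    meta = host_names(gmetad_xml)
    return sorted(core - meta), sorted(meta - core)


# Usage (hostnames and ports are placeholders, adjust to your setup):
#   core_xml = fetch_xml("node01.example.org", 8649)
#   meta_xml = fetch_xml("gmetad.example.org", 8651)
#   missing_from_gmetad, missing_from_core = compare_hosts(core_xml, meta_xml)
```

If the first list comes back non-empty, the monitoring core knows about hosts that gmetad isn't reporting yet, which would point at the metadaemon rather than at multicast.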

Assuming the monitoring core's OK, then the metadaemon's being funny. What happens if you restart the metadaemon? What happens if you change the polling interval?

I shudder to recommend running gmetad with debug on for several hours ... heck, I left it on for ten minutes once and it made me cry, but that was on a few hundred nodes. :) Unfortunately, that may be the only thing that can offer you clues as to why it's taking so long for new nodes to pop up.

gmetad should be walking straight through the <HOST> tags in the XML parser. Maybe there's something keeping it from finishing... assuming the new hosts are at the bottom?
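One way to test the "new hosts are at the bottom" guess is to list the <HOST> tags in document order and see where the new nodes land. This is only a sketch that works on an XML dump you have already saved; the function name and the sample host names are hypothetical.

```python
import xml.etree.ElementTree as ET


def host_positions(xml_bytes, wanted):
    """Map each wanted host name to its 0-based position among <HOST> tags.

    Hosts that do not appear in the XML are left out of the result.
    """
    root = ET.fromstring(xml_bytes)
    ordered = [host.get("NAME") for host in root.iter("HOST")]
    return {name: ordered.index(name) for name in wanted if name in ordered}


# Tiny made-up dump: "new1" is the last <HOST> element in the document.
sample = (
    b'<GANGLIA_XML><CLUSTER NAME="c">'
    b'<HOST NAME="old1"/><HOST NAME="old2"/><HOST NAME="new1"/>'
    b'</CLUSTER></GANGLIA_XML>'
)
print(host_positions(sample, ["new1", "old1", "absent"]))
```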

Just a few ideas for you.  Good luck!

Marcia Prescott wrote:
I've been using ganglia for a couple of months.  In
this time it has been great.  I set it up so that I
would gather the information at longer intervals of 15
minutes between polls.  This has been working great!

A couple of days ago, I started gmond on 30 nodes of a
cluster. I was already monitoring 3 of those nodes. The data has been slow to join the summary information. Could this be because I changed the polling intervals?
I would think that the nodes that I added would all
show up at the same time.  Instead, a new node joins
the summary information about every 4 hours.  Why is
this?  It is about every four hours that graphs appear
for the nodes, as well.
I can see the graphical representation for all of the
nodes, but they lead to a page with no collected data
for the node.
The XML data is sending the information for all of the
nodes, but the Gmetad doesn't seem to be collecting
the information as expected. Why does it take so long
to collect the data?

Thanks,

Marcia Prescott

_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


