I had this problem on a cluster running Ganglia 3.0.1. In the cluster view, only some hosts had graphs, and the rollup displays were either not created at all or not updated.
It turned out to be "caused" by two nodes in my cluster (different IPs) having the same name upon reverse lookup. A DNS error, of course, but it meant that one host appeared to be included twice in the XML stream to gmetad.

But why was it bombing? It turns out that processing the second instance of a host would use the same timestamp for a metric as the first, and so

<snip> (rrd_helpers.c, line 187)
    rrd_update(argc, argv);
    if(rrd_test_error())
       {
          err_msg("RRD_update (%s): %s", rrd, rrd_get_error());
          pthread_mutex_unlock( &rrd_mutex );
          return 1;
       }
<snip>

errored with:

/usr/sbin/gmetad[12463]: RRD_update (/rrds/FITest1/ldnpsm020000295.intranet.barcapint.com/mem_total.rrd): illegal attempt to update using time 1132851721 when last update time is 1132851721 (minimum one second step)

I did not trace it back up the call tree, but an error return here seems to stop further processing of the XML and rollups for that data source.

I of course hacked the "return 1" to a "return 0", and everything worked, despite my admittedly buggy XML data supply.

This was painful to find, which is why I am sharing it with you. The true fix lies higher up of course, but I leave that to the gurus.

kind regards,
Richard
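
P.S. A gentler alternative to flipping the return value for every error might be to ignore only the equal-timestamp case. What follows is a minimal sketch, not a tested gmetad patch. It assumes the rrdtool error text keeps the wording shown in the log line above, and is_same_time_error is a made-up helper name, not something that exists in gmetad or librrd.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of gmetad or librrd).
 * Returns 1 when the error text reports an update whose timestamp is
 * equal to the last update time, i.e. a duplicate sample; 0 otherwise.
 * Assumes the message wording seen in the log line quoted above.
 */
static int is_same_time_error(const char *msg)
{
    const char *p;
    long t_new, t_last;

    if (msg == NULL)
        return 0;

    /* Find the start of the rrdtool complaint, skipping any prefix. */
    p = strstr(msg, "illegal attempt to update using time ");
    if (p == NULL)
        return 0;

    /* Pull out both timestamps; anything else is treated as a real error. */
    if (sscanf(p,
               "illegal attempt to update using time %ld"
               " when last update time is %ld",
               &t_new, &t_last) != 2)
        return 0;

    return t_new == t_last;
}

/*
 * One possible way to use it in the error branch quoted above
 * (an untested assumption about how it could be wired in):
 *
 *     return is_same_time_error(rrd_get_error()) ? 0 : 1;
 */
```

That way a genuine RRD failure would still return 1, and only the duplicate-host case would be let through. The duplicate name in DNS still wants fixing, of course.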