We're in the process of setting up a new HPC environment consisting of a few
head nodes and 40 compute nodes, all running RedHat Linux 5.  We installed
ganglia-gmetad-3.0.7-1.i386.rpm
<http://downloads.sourceforge.net/ganglia/ganglia-gmetad-3.0.7-1.i386.rpm?modtime=1204128845&big_mirror=0>
on the head node, and ganglia-gmond-3.0.7-1.i386.rpm on each of the compute
nodes as well as the head node. The head node is multi-homed, with eth1
being on the network that all the compute nodes are on, so gmond.conf on the
head node has the following:

udp_send_channel {
  mcast_join =  239.2.11.71
  mcast_if = eth1
  port = 8649
  ttl = 1
}

udp_recv_channel {
  mcast_join = 239.2.11.71
  mcast_if = eth1
  port = 8649
  bind =  239.2.11.71
}

We also added a static route on the head node:

# route add -host  239.2.11.71 dev eth1

The gmond.conf on the compute nodes is virtually identical except they don't
have the mcast_if entries since they're all single homed.
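
Since both sides use the same multicast group and port, one way to check
whether packets from the compute nodes reach eth1 on the head node at all is
to join the group by hand and count who is sending. The following is only a
rough sketch and not part of Ganglia: the function names are made up for
illustration, and iface_ip stands for the address assigned to eth1, which has
to be supplied by whoever runs it. If gmond already holds the port, the bind
may fail and gmond may need to be stopped first.

```python
import socket
import time

GROUP = "239.2.11.71"  # mcast_join value from the gmond.conf shown above
PORT = 8649            # port value from the gmond.conf shown above


def membership_request(group, iface_ip):
    """Pack an ip_mreq: 4 bytes of group address, then 4 bytes of local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def count_senders(iface_ip, seconds=30.0):
    """Join GROUP on the interface that owns iface_ip and count packets per source address.

    iface_ip is an assumption: the IPv4 address configured on eth1.
    """
    counts = {}
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(GROUP, iface_ip))
        deadline = time.monotonic() + seconds
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                _, (addr, _port) = sock.recvfrom(65535)
            except socket.timeout:
                break
            counts[addr] = counts.get(addr, 0) + 1
    finally:
        sock.close()
    return counts
```

Calling count_senders() with the eth1 address and printing the result should
show one entry per sending host. If only the head node's own address shows up,
the multicast traffic is presumably not arriving on that interface.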

The problem we're having is that none of the gmond data from the compute
nodes seems to be making it to the head node.  gmetad seems to only be
collecting data from the head node.  If we telnet to port 8649 on the head
node it only shows data about the head node.  And when we log into the web
interface the only node listed is the head node.
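
For anyone who wants to repeat the telnet check without reading the raw XML,
here is a minimal sketch that pulls the dump from port 8649 and lists the
hosts it contains. It assumes the dump uses CLUSTER and HOST elements with
NAME attributes, which is what we see in the output here; the function names
are my own and not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Connect to gmond's TCP port and return everything it sends, as bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the peer closed the connection, so the dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def list_hosts(xml_bytes):
    """Return a sorted list of (cluster name, host name) pairs found in the dump."""
    root = ET.fromstring(xml_bytes)
    pairs = []
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            pairs.append((cluster.get("NAME"), host.get("NAME")))
    return sorted(pairs)
```

Running list_hosts(read_gmond_xml("<head node>")) against our head node should
return a single pair, matching what the telnet session and the web interface
show.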

I don't know whether this is related, but another problem is that none of
the graphs appear to be created properly.  In Firefox, when I click on a
link to view the load, etc. of the head node, it just echoes back a URL like
this (although it appears to be rendered as an image and not as text):
https://<server>/ganglia/graph.php?g=load_report&z=large&c=Cluster02&m=&r=hour&s=descending&hc=4&st=1216651527.
In some other browsers the graphs show up as broken images.  Any idea why
this might be happening?

Thanks,

-Bruce

_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general