Hi Bruce:

On Mon, Jul 21, 2008 at 7:49 AM, Bruce Pennypacker
<[EMAIL PROTECTED]> wrote:

> We're in the process of setting up a new HPC environment consisting of a few
> head nodes and 40 compute nodes, all running RedHat Linux 5.  We installed
> ganglia-gmetad-3.0.7-1.i386.rpm on the head node, and
> ganglia-gmond-3.0.7-1.i386.rpm on each of the compute nodes as well as the
> head node. The head node is multi-homed, with eth1 being on the network that
> all the compute nodes are on, so gmond.conf on the head node has the
> following:
>
> udp_send_channel {
>   mcast_join =  239.2.11.71
>   mcast_if = eth1
>   port = 8649
>   ttl = 1
> }
>
> udp_recv_channel {
>   mcast_join = 239.2.11.71
>   mcast_if = eth1
>   port = 8649
>   bind =  239.2.11.71
> }
>
> We also added a static route on the head node:
>
> # route add -host  239.2.11.71 dev eth1
>
> The gmond.conf on the compute nodes is virtually identical except they don't
> have the mcast_if entries since they're all single homed.
>
> The problem we're having is that none of the gmond data from the compute
> nodes seems to be making it to the head node.  gmetad seems to only be
> collecting data from the head node.  If we telnet to port 8649 on the head
> node it only shows data about the head node.  And when we log into the web
> interface the only node listed is the head node.

Just as a test, could you switch eth0 and eth1 on the head node (i.e.,
use eth0 as the NIC that talks to your compute nodes), remove the
static route, and see if this changes anything?

I suspect your issue is related to network routing rather than to
Ganglia itself.

If multicast does not work out, you can try unicast.  For more
information, check out the man page for gmond.conf.
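
As a rough sketch of what unicast might look like, assuming the head
node's eth1 address is 10.0.0.1 (a placeholder, substitute your own)
and that you keep port 8649: each gmond sends straight to the head
node instead of joining the multicast group, and only the head node
listens. The existing tcp_accept_channel would stay as it is.

```
/* Compute nodes: send metrics directly to the head node.
   10.0.0.1 is a placeholder for the head node's eth1 address. */
udp_send_channel {
  host = 10.0.0.1
  port = 8649
}

/* Head node: send its own metrics to itself and listen for
   everyone else's. No mcast_join, mcast_if or bind lines. */
udp_send_channel {
  host = 10.0.0.1
  port = 8649
}

udp_recv_channel {
  port = 8649
}
```

With this layout the static multicast route should no longer be
needed, and a telnet to port 8649 on the head node should list the
compute nodes once they have reported in.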

Cheers,

Bernard

_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general