Re: [Ganglia-general] unable to survey all nodes

2002-06-04 Thread matt massie
May 23, marino vetuschi zuccolini wrote forth saying...

 Hello to all

 I've a dual frontend with two eth cards. The internal net (eth0) is
 10.0.0.* and spans from 1 (the frontend)  to 6 (5 dual slaves): the
 nodes are called baxeico** (from 00 to 05). Gmond runs on all the
 nodes as well as ganglia-php-rrd. Gexec and authd run well.
 
 Gstat lists only the front end (!! with the name of the eth1 net,
 which is on the Real World!!). This happens also on my web page of
 ganglia.
 
 Before a crash of the frontend due to a kernel bug, maybe during MPI
 comm between nodes, all the nodes were listed as baxeico00-baxeico05.
 I've also added a route add -host 239.2.11.71 eth0 and the tcpdump
 listing is clear.
 
 What is broken when there is a sudden death of the frontend that
 doesn't restart during the next boot?

The route you added with route add -host 239.2.11.71 eth0 doesn't persist
across reboots. You need to add that command to an init script so it is run
at boot time.

-matt
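
As an illustration of the suggestion above, here is a minimal sketch of a
boot-time fragment that re-adds the multicast route. The address and interface
come from this thread; the function name add_mcast_route and the ROUTE_CMD
variable are hypothetical, and where the fragment belongs (rc.local, a
dedicated init script, etc.) depends on the distribution. By default it only
prints the command, so it can be checked before being wired into a real script.

```shell
#!/bin/sh
# Sketch only: re-add the Ganglia multicast route at boot time.
# MCAST_ADDR and MCAST_IF are the values mentioned in this thread.
MCAST_ADDR="239.2.11.71"
MCAST_IF="eth0"

# ROUTE_CMD is a hypothetical knob: it defaults to a dry run that just
# prints the command. Set ROUTE_CMD=route in the real init script.
ROUTE_CMD="${ROUTE_CMD:-echo route}"

add_mcast_route() {
    # $1 = multicast address, $2 = interface to send multicast traffic on.
    # $ROUTE_CMD is left unquoted on purpose so "echo route" splits in two.
    $ROUTE_CMD add -host "$1" "$2"
}

add_mcast_route "$MCAST_ADDR" "$MCAST_IF"
# dry run prints: route add -host 239.2.11.71 eth0
```

The route has to exist before gmond starts, so the fragment should run earlier
in the boot sequence than the gmond init script.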




Re: [Ganglia-general] unable to survey all nodes

2002-05-23 Thread Joe Griffin

Marino,

marino vetuschi zuccolini wrote:

 Hello to all
 I've a dual frontend with two eth cards.
 The internal net (eth0) is 10.0.0.* and spans from 1 (the frontend) to 6
 (5 dual slaves): the nodes are called baxeico** (from 00 to 05).

 Gmond runs on all the nodes as well as ganglia-php-rrd.
 Gexec and authd run well.

 Gstat lists only the front end (!! with the name of the eth1 net, which
 is on the Real World!!).

 This happens also on my web page of ganglia.


I get the internal nodes by using --mcast_if in the init.d/gmond script:

loadproc $GMOND --mcast_if=eth1

On RedHat systems, the line will probably look like:

daemon $GMOND --mcast_if=eth1

Regards,
Joe
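
To avoid hard-coding the interface, the init-script line above could take it
from a variable. This is only a sketch: the gmond path, the MCAST_IF variable
and the gmond_cmdline helper are assumptions, not part of the stock init
script. The interface should be the one attached to the cluster's internal
network (eth1 on Joe's machine, eth0 in the original poster's description).

```shell
#!/bin/sh
# Sketch only: build the gmond command line with a configurable interface.
GMOND="${GMOND:-/usr/sbin/gmond}"   # assumed install path
MCAST_IF="${MCAST_IF:-eth1}"        # interface on the internal network

# Print the command line gmond would be started with, for inspection.
gmond_cmdline() {
    echo "$GMOND --mcast_if=$MCAST_IF"
}

gmond_cmdline
# In the real init script the start line would then read, e.g. on RedHat:
#   daemon $GMOND --mcast_if=$MCAST_IF
```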


 Before a crash of the frontend due to a kernel bug, maybe during MPI
 comm between nodes, all the nodes were listed as baxeico00-baxeico05.
 I've also added a route add -host 239.2.11.71 eth0 and the tcpdump
 listing is clear.

 What is broken when there is a sudden death of the frontend that doesn't
 restart during the next boot?

 Many thanks to all

 m.

___
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general