Well, this is a new one - at least for me.

One of our clusters was rebooted last week because of a physical
relocation. Since then, the ganglia XML data no longer contains any
mention of the cluster frontend, even though gmond is running fine
and responding on the XML data port:

    nixon $ telnet grendel 8649|grep -i "host name"|cut -c -60
    Connection closed by foreign host.
          <!ATTLIST HOST NAME CDATA #REQUIRED>
    <HOST NAME="g10" IP="192.168.1.10" REPORTED="1047377023" TN=
    <HOST NAME="g11" IP="192.168.1.11" REPORTED="1047377026" TN=
    <HOST NAME="g12" IP="192.168.1.12" REPORTED="1047377029" TN=
    <HOST NAME="g13" IP="192.168.1.13" REPORTED="1047377026" TN=
    <HOST NAME="g1" IP="192.168.1.1" REPORTED="1047377032" TN="0
    <HOST NAME="g2" IP="192.168.1.2" REPORTED="1047377029" TN="3
    <HOST NAME="g16" IP="192.168.1.16" REPORTED="1047377022" TN=
    <HOST NAME="g4" IP="192.168.1.4" REPORTED="1047377025" TN="7
    <HOST NAME="g5" IP="192.168.1.5" REPORTED="1047377023" TN="9
    <HOST NAME="g6" IP="192.168.1.6" REPORTED="1047377031" TN="1
    <HOST NAME="g8" IP="192.168.1.8" REPORTED="1047377028" TN="4
    <HOST NAME="g9" IP="192.168.1.9" REPORTED="1047377022" TN="1
    nixon $

The frontend used to turn up as "g0".
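
The same check can be scripted instead of eyeballing telnet output. This is only a rough sketch: it assumes gmond dumps its complete XML on connect to port 8649 and then closes the connection, as in the transcript above. The function names (host_names, fetch_xml, missing_hosts) are made up for illustration and are not part of ganglia.

```python
import re
import socket

# Matches the opening of a <HOST ...> element and captures its NAME attribute.
# The DTD line "<!ATTLIST HOST NAME CDATA #REQUIRED>" does not match this.
HOST_RE = re.compile(r'<HOST NAME="([^"]+)"')


def host_names(xml_text):
    """Return the NAME of every <HOST ...> element, in document order."""
    return HOST_RE.findall(xml_text)


def missing_hosts(xml_text, expected):
    """Return the expected host names that do not appear in the XML."""
    seen = set(host_names(xml_text))
    return [name for name in expected if name not in seen]


def fetch_xml(host, port=8649, timeout=10.0):
    """Read everything the daemon sends until it closes the connection.

    Assumption: the XML port needs no request, it just dumps and closes.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example use against the cluster from this post (not run here):
#   xml_text = fetch_xml("grendel")
#   print(missing_hosts(xml_text, ["g0"]))
```

If missing_hosts comes back with "g0" in it, the frontend is absent from the XML, which is the symptom described here.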

Both ganglia 2.5.1 and 2.5.3 show the same behaviour. I've run
gmond for a while with debug enabled, but nothing in the output seems
alarming to me. Anyone who wants to take a look can find the log at:

  http://www.nsc.liu.se/~nixon/tmp/ganglia.log

What blindingly obvious mistake am I making here?

-- 
Leif Nixon                                    Systems expert
------------------------------------------------------------
National Supercomputer Centre           Linkoping University
------------------------------------------------------------
