I have updated the entry on the FAQ page with what I hope is some clearer commentary on this issue.
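
For anyone landing on this thread from a search, here is a minimal sketch of where the setting discussed below would go. It assumes a stock Ganglia 3.1.x gmond.conf, in which send_metadata_interval lives in the globals section. The value 600 is only the example figure from the thread, not a recommendation:

```
/* gmond.conf on each node that sends to the unicast collector (sketch only) */
globals {
  /* Assumption: every other globals setting stays as it is on your system. */
  send_metadata_interval = 600 /* secs; 0 disables periodic resending */
}
```

With a non-zero interval each node resends its metric metadata periodically, so a restarted collector should pick the metrics up again without gmond being restarted on every node.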
> -----Original Message-----
> From: Bernard Li [mailto:bern...@vanhpc.org]
> Sent: Thursday, November 18, 2010 1:32 AM
> To: Bostjan Skufca
> Cc: Louis Coilliot; ganglia-general@lists.sourceforge.net
> Subject: Re: [Ganglia-general] restarting the gmond collector node causes no data to be reported
>
> Thanks for the feedback guys.  Would one of you like to edit the Wiki
> and add more clarity to it?  Please let me know if you run into any
> issues with the edits (I think you just need a SF.net id to do so).
>
> Cheers,
>
> Bernard
>
> On Wed, Nov 17, 2010 at 8:14 PM, Bostjan Skufca <bost...@a2o.si> wrote:
> > It definitely is unclear.
> > I, for one, did have a bit (large bit:) of a problem with this. If
> > only faq would say "...or when graphs are not updated" or something
> > similar.
> >
> > b.
> >
> > On 17 November 2010 22:36, Cameron L. Spitzer <cspit...@nvidia.com> wrote:
> >>
> >> Just out of curiosity, I followed the link in Bernard's message.
> >> I didn't find anything related to Russell's question.
> >> I followed the link to Current Release Notes, and searched the page
> >> for send_metadata_interval, which is cheating, because I would only
> >> have Russell's question if I didn't know about send_metadata_interval.
> >>
> >> Then I followed the link to Ganglia FAQs.
> >> Someone who already understood Ganglia pretty well might make the
> >> connection between Russell's question " ... no metrics are reported
> >> anymore" and the FAQ "Sometimes graphs don't show up for hosts."
> >> I doubt a newcomer would see it.  That's unclear.
> >>
> >> -Cameron in Los Gatos
> >>
> >> Bernard Li wrote:
> >>
> >> Hello:
> >>
> >> This is actually documented in both the release notes and the FAQs
> >> in our Wiki:
> >>
> >> http://sourceforge.net/apps/trac/ganglia/wiki
> >>
> >> Please let us know if anything is unclear.
> >>
> >> Thanks,
> >>
> >> Bernard
> >>
> >> On Wed, Nov 17, 2010 at 1:14 PM, Louis Coilliot <louis.coill...@think.fr> wrote:
> >>
> >> Hello, this behaviour is reported from time to time with unicast :)
> >>
> >> Use:
> >> send_metadata_interval = 600
> >>
> >> (600, for example)
> >>
> >> on the gmond.conf for your nodes.
> >>
> >> The metrics should get back after a while.
> >>
> >> Louis
> >>
> >> 2010/11/17 Auld, Russell G CSC <russell.a...@pw.utc.com>:
> >>
> >> I'm running ganglia 3.1.7 on some RHEL computers.
> >> I have four separate clusters configured, with each one running in
> >> unicast mode.  Each cluster uses a different port number in their
> >> gmond.conf files.
> >>
> >> Here's one example:
> >>
> >> udp_send_channel {
> >>   #bind_hostname = yes # Highly recommended, soon to be default.
> >>                        # This option tells gmond to use a source address
> >>                        # that resolves to the machine's hostname.  Without
> >>                        # this, the metrics may appear to come from any
> >>                        # interface and the DNS names associated with
> >>                        # those IPs will be used to create the RRDs.
> >>   host = 192.168.115.100 # the gmond "collector" for this cluster
> >>   port = 8655
> >>   ttl = 1
> >> }
> >>
> >> /* You can specify as many udp_recv_channels as you like as well. */
> >> udp_recv_channel {
> >>   port = 8655
> >> }
> >>
> >> /* You can specify as many tcp_accept_channels as you like to share
> >>    an xml description of the state of the cluster */
> >> tcp_accept_channel {
> >>   port = 8655
> >> }
> >>
> >> The above configuration is installed on each node in the cluster,
> >> including the "collector" node.  The collector node is identified in the
> >> gmetad.conf file as a data source.
> >>
> >> The problem I'm having is that if the "collector" node's gmond is
> >> restarted for whatever reason, no metrics are reported anymore for the
> >> cluster.  The front-end still shows the correct number of hosts, and they
> >> all appear "up", but there just isn't any data flowing.  If I restart
> >> gmond on all the nodes in the cluster, things will all work again.
> >> Is this a bug?  Or is there something wrong with the configuration above?
> >>
> >> If I telnet to one of the nodes in the cluster using the specified port,
> >> I get output, but there's no data in it as shown below.  If I use the web
> >> page to show the host report for the node, it reports that it's up, and
> >> that it last reported 15 seconds ago (or less), but there are no metrics
> >> shown on the page.
> >>
> >> [h...@derp] ~ 133> telnet 192.168.115.164 8655
> >> Trying 192.168.115.164...
> >> Connected to 192.168.115.164.
> >> Escape character is '^]'.
> >> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> >> <!DOCTYPE GANGLIA_XML [
> >>    <!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
> >>       <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED>
> >>       <!ATTLIST GANGLIA_XML SOURCE CDATA #REQUIRED>
> >>    <!ELEMENT GRID (CLUSTER | GRID | HOSTS | METRICS)*>
> >>       <!ATTLIST GRID NAME CDATA #REQUIRED>
> >>       <!ATTLIST GRID AUTHORITY CDATA #REQUIRED>
> >>       <!ATTLIST GRID LOCALTIME CDATA #IMPLIED>
> >>    <!ELEMENT CLUSTER (HOST | HOSTS | METRICS)*>
> >>       <!ATTLIST CLUSTER NAME CDATA #REQUIRED>
> >>       <!ATTLIST CLUSTER OWNER CDATA #IMPLIED>
> >>       <!ATTLIST CLUSTER LATLONG CDATA #IMPLIED>
> >>       <!ATTLIST CLUSTER URL CDATA #IMPLIED>
> >>       <!ATTLIST CLUSTER LOCALTIME CDATA #REQUIRED>
> >>    <!ELEMENT HOST (METRIC)*>
> >>       <!ATTLIST HOST NAME CDATA #REQUIRED>
> >>       <!ATTLIST HOST IP CDATA #REQUIRED>
> >>       <!ATTLIST HOST LOCATION CDATA #IMPLIED>
> >>       <!ATTLIST HOST REPORTED CDATA #REQUIRED>
> >>       <!ATTLIST HOST TN CDATA #IMPLIED>
> >>       <!ATTLIST HOST TMAX CDATA #IMPLIED>
> >>       <!ATTLIST HOST DMAX CDATA #IMPLIED>
> >>       <!ATTLIST HOST GMOND_STARTED CDATA #IMPLIED>
> >>    <!ELEMENT METRIC (EXTRA_DATA*)>
> >>       <!ATTLIST METRIC NAME CDATA #REQUIRED>
> >>       <!ATTLIST METRIC VAL CDATA #REQUIRED>
> >>       <!ATTLIST METRIC TYPE (string | int8 | uint8 | int16 | uint16 |
> >>                 int32 | uint32 | float | double | timestamp) #REQUIRED>
> >>       <!ATTLIST METRIC UNITS CDATA #IMPLIED>
> >>       <!ATTLIST METRIC TN CDATA #IMPLIED>
> >>       <!ATTLIST METRIC TMAX CDATA #IMPLIED>
> >>       <!ATTLIST METRIC DMAX CDATA #IMPLIED>
> >>       <!ATTLIST METRIC SLOPE (zero | positive | negative | both |
> >>                 unspecified) #IMPLIED>
> >>       <!ATTLIST METRIC SOURCE (gmond) 'gmond'>
> >>    <!ELEMENT EXTRA_DATA (EXTRA_ELEMENT*)>
> >>    <!ELEMENT EXTRA_ELEMENT EMPTY>
> >>       <!ATTLIST EXTRA_ELEMENT NAME CDATA #REQUIRED>
> >>       <!ATTLIST EXTRA_ELEMENT VAL CDATA #REQUIRED>
> >>    <!ELEMENT HOSTS EMPTY>
> >>       <!ATTLIST HOSTS UP CDATA #REQUIRED>
> >>       <!ATTLIST HOSTS DOWN CDATA #REQUIRED>
> >>       <!ATTLIST HOSTS SOURCE (gmond | gmetad) #REQUIRED>
> >>    <!ELEMENT METRICS (EXTRA_DATA*)>
> >>       <!ATTLIST METRICS NAME CDATA #REQUIRED>
> >>       <!ATTLIST METRICS SUM CDATA #REQUIRED>
> >>       <!ATTLIST METRICS NUM CDATA #REQUIRED>
> >>       <!ATTLIST METRICS TYPE (string | int8 | uint8 | int16 | uint16 |
> >>                 int32 | uint32 | float | double | timestamp) #REQUIRED>
> >>       <!ATTLIST METRICS UNITS CDATA #IMPLIED>
> >>       <!ATTLIST METRICS SLOPE (zero | positive | negative | both |
> >>                 unspecified) #IMPLIED>
> >>       <!ATTLIST METRICS SOURCE (gmond) 'gmond'>
> >> ]>
> >> <GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
> >> <CLUSTER NAME="DERP" LOCALTIME="1290018259" OWNER="HERP"
> >>          LATLONG="unspecified" URL="unspecified">
> >> </CLUSTER>
> >> </GANGLIA_XML>
> >> Connection closed by foreign host.
> >>
> >> ------------------------------------------------------------------------------
> >> Beautiful is writing same markup.  [huh?]
> >>
> >> ________________________________
> >> This email message is for the sole use of the intended recipient(s)
> >> and may contain confidential information.  Any unauthorized review,
> >> use, disclosure or distribution is prohibited.
> >> If you are not the intended recipient, please contact the sender by
> >> reply email and destroy all copies of the original message.
> >> ________________________________
> >>
> >> ------------------------------------------------------------------------------
> >> Beautiful is writing same markup. Internet Explorer 9 supports
> >> standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3.
> >> Spend less time writing and rewriting code and more time creating great
> >> experiences on the web. Be a part of the beta today
> >> http://p.sf.net/sfu/msIE9-sfdev2dev
> >> _______________________________________________
> >> Ganglia-general mailing list
> >> Ganglia-general@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/ganglia-general