I have updated the entry on the FAQ page with what I hope is some clearer 
commentary on this issue. 
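
For anyone landing here from a search: the fix suggested in the quoted
thread below amounts to roughly the following in gmond.conf on each
sending node. This is only a sketch. It assumes the setting sits in the
globals section, as in the stock 3.1.x config file, and 600 is just the
example value from the thread, not a recommendation.

globals {
  /* Assumption: the rest of your globals settings stay as they are. */
  send_metadata_interval = 600  # seconds; example value from the thread.
                                # As I understand it, the default of 0 means
                                # metadata is sent only when gmond starts,
                                # which is why a restarted unicast collector
                                # never learns about the metrics again.
}

After changing it, restart gmond on the sending nodes once. The metrics
should come back on their own after the next collector restart, within
roughly the interval you chose.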

> -----Original Message-----
> From: Bernard Li [mailto:bern...@vanhpc.org] 
> Sent: Thursday, November 18, 2010 1:32 AM
> To: Bostjan Skufca
> Cc: Louis Coilliot; ganglia-general@lists.sourceforge.net
> Subject: Re: [Ganglia-general] restarting the gmond collector 
> node causes no data to be reported
> 
> Thanks for the feedback guys.  Would one of you like to edit the Wiki
> and add more clarity to it?  Please let me know if you run into any
> issues with the edits (I think you just need a SF.net id to do so).
> 
> Cheers,
> 
> Bernard
> 
> On Wed, Nov 17, 2010 at 8:14 PM, Bostjan Skufca <bost...@a2o.si> wrote:
> > It definitely is unclear.
> > I, for one, did have a bit (large bit:) of a problem with this. If
> > only faq would say "...or when graphs are not updated" or something
> > similar.
> >
> > b.
> >
> >
> > On 17 November 2010 22:36, Cameron L. Spitzer <cspit...@nvidia.com> wrote:
> >>
> >> Just out of curiosity, I followed the link in Bernard's message.
> >> I didn't find anything related to Russell's question.
> >> I followed the link to Current Release Notes, and searched the page for
> >> send_metadata_interval, which is cheating, because I would only have
> >> Russell's question if I didn't know about send_metadata_interval.
> >>
> >> Then I followed the link to Ganglia FAQs.
> >> Someone who already understood Ganglia pretty well might make the
> >> connection between Russell's question " ... no metrics are reported
> >> anymore" and the FAQ "Sometimes graphs don't show up for hosts."
> >> I doubt a newcomer would see it.  That's unclear.
> >>
> >>
> >> -Cameron in Los Gatos
> >>
> >>
> >>
> >> Bernard Li wrote:
> >>
> >> Hello:
> >>
> >> This is actually documented in both the release notes and the FAQs in
> >> our Wiki:
> >>
> >> http://sourceforge.net/apps/trac/ganglia/wiki
> >>
> >> Please let us know if anything is unclear.
> >>
> >> Thanks,
> >>
> >> Bernard
> >>
> >> On Wed, Nov 17, 2010 at 1:14 PM, Louis Coilliot <louis.coill...@think.fr> wrote:
> >>
> >>
> >> Hello, this behaviour is reported from time to time with unicast :)
> >>
> >> Use:
> >> send_metadata_interval = 600
> >>
> >> (600, for example)
> >>
> >> on the gmond.conf for your nodes.
> >>
> >> The metrics should get back after a while.
> >>
> >> Louis
> >>
> >> 2010/11/17 Auld, Russell G           CSC <russell.a...@pw.utc.com>:
> >>
> >>
> >> I'm running ganglia 3.1.7 on some RHEL computers.
> >> I have four separate clusters configured, with each one running in
> >> unicast mode. Each cluster uses a different port number in their
> >> gmond.conf files.
> >>
> >> Here's one example:
> >>
> >> udp_send_channel {
> >>  #bind_hostname = yes # Highly recommended, soon to be default.
> >>                       # This option tells gmond to use a source address
> >>                       # that resolves to the machine's hostname.  Without
> >>                       # this, the metrics may appear to come from any
> >>                       # interface and the DNS names associated with
> >>                       # those IPs will be used to create the RRDs.
> >>  host = 192.168.115.100  # the gmond "collector" for this cluster
> >>  port = 8655
> >>  ttl = 1
> >> }
> >>
> >> /* You can specify as many udp_recv_channels as you like as well. */
> >> udp_recv_channel {
> >>  port = 8655
> >> }
> >>
> >> /* You can specify as many tcp_accept_channels as you like to share
> >>   an xml description of the state of the cluster */
> >> tcp_accept_channel {
> >>  port = 8655
> >> }
> >>
> >> The above configuration is installed on each node in the cluster,
> >> including the "collector" node. The collector node is identified in the
> >> gmetad.conf file as a data source.
> >>
> >> The problem I'm having is that if the "collector" node's gmond is
> >> restarted for whatever reason, no metrics are reported anymore for the
> >> cluster. The front-end still shows the correct number of hosts, and they
> >> all appear "up", but there just isn't any data flowing. If I restart
> >> gmond on all the nodes in the cluster, things will all work again.
> >> Is this a bug? Or is there something wrong with the configuration above?
> >>
> >> If I telnet to one of the nodes in the cluster using the specified port,
> >> I get output, but there's no data in it as shown below. If I use the web
> >> page to show the host report for the node, it reports that it's up, and
> >> that it last reported 15 seconds ago (or less), but there are no metrics
> >> shown on the page.
> >>
> >>
> >> [h...@derp] ~ 133> telnet 192.168.115.164 8655
> >> Trying 192.168.115.164...
> >> Connected to 192.168.115.164.
> >> Escape character is '^]'.
> >> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> >> <!DOCTYPE GANGLIA_XML [
> >>   <!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
> >>      <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED>
> >>      <!ATTLIST GANGLIA_XML SOURCE CDATA #REQUIRED>
> >>   <!ELEMENT GRID (CLUSTER | GRID | HOSTS | METRICS)*>
> >>      <!ATTLIST GRID NAME CDATA #REQUIRED>
> >>      <!ATTLIST GRID AUTHORITY CDATA #REQUIRED>
> >>      <!ATTLIST GRID LOCALTIME CDATA #IMPLIED>
> >>   <!ELEMENT CLUSTER (HOST | HOSTS | METRICS)*>
> >>      <!ATTLIST CLUSTER NAME CDATA #REQUIRED>
> >>      <!ATTLIST CLUSTER OWNER CDATA #IMPLIED>
> >>      <!ATTLIST CLUSTER LATLONG CDATA #IMPLIED>
> >>      <!ATTLIST CLUSTER URL CDATA #IMPLIED>
> >>      <!ATTLIST CLUSTER LOCALTIME CDATA #REQUIRED>
> >>   <!ELEMENT HOST (METRIC)*>
> >>      <!ATTLIST HOST NAME CDATA #REQUIRED>
> >>      <!ATTLIST HOST IP CDATA #REQUIRED>
> >>      <!ATTLIST HOST LOCATION CDATA #IMPLIED>
> >>      <!ATTLIST HOST REPORTED CDATA #REQUIRED>
> >>      <!ATTLIST HOST TN CDATA #IMPLIED>
> >>      <!ATTLIST HOST TMAX CDATA #IMPLIED>
> >>      <!ATTLIST HOST DMAX CDATA #IMPLIED>
> >>      <!ATTLIST HOST GMOND_STARTED CDATA #IMPLIED>
> >>   <!ELEMENT METRIC (EXTRA_DATA*)>
> >>      <!ATTLIST METRIC NAME CDATA #REQUIRED>
> >>      <!ATTLIST METRIC VAL CDATA #REQUIRED>
> >>      <!ATTLIST METRIC TYPE (string | int8 | uint8 | int16 | uint16 | int32 | uint32 | float | double | timestamp) #REQUIRED>
> >>      <!ATTLIST METRIC UNITS CDATA #IMPLIED>
> >>      <!ATTLIST METRIC TN CDATA #IMPLIED>
> >>      <!ATTLIST METRIC TMAX CDATA #IMPLIED>
> >>      <!ATTLIST METRIC DMAX CDATA #IMPLIED>
> >>      <!ATTLIST METRIC SLOPE (zero | positive | negative | both | unspecified) #IMPLIED>
> >>      <!ATTLIST METRIC SOURCE (gmond) 'gmond'>
> >>   <!ELEMENT EXTRA_DATA (EXTRA_ELEMENT*)>
> >>   <!ELEMENT EXTRA_ELEMENT EMPTY>
> >>      <!ATTLIST EXTRA_ELEMENT NAME CDATA #REQUIRED>
> >>      <!ATTLIST EXTRA_ELEMENT VAL CDATA #REQUIRED>
> >>   <!ELEMENT HOSTS EMPTY>
> >>      <!ATTLIST HOSTS UP CDATA #REQUIRED>
> >>      <!ATTLIST HOSTS DOWN CDATA #REQUIRED>
> >>      <!ATTLIST HOSTS SOURCE (gmond | gmetad) #REQUIRED>
> >>   <!ELEMENT METRICS (EXTRA_DATA*)>
> >>      <!ATTLIST METRICS NAME CDATA #REQUIRED>
> >>      <!ATTLIST METRICS SUM CDATA #REQUIRED>
> >>      <!ATTLIST METRICS NUM CDATA #REQUIRED>
> >>      <!ATTLIST METRICS TYPE (string | int8 | uint8 | int16 | uint16 | int32 | uint32 | float | double | timestamp) #REQUIRED>
> >>      <!ATTLIST METRICS UNITS CDATA #IMPLIED>
> >>      <!ATTLIST METRICS SLOPE (zero | positive | negative | both | unspecified) #IMPLIED>
> >>      <!ATTLIST METRICS SOURCE (gmond) 'gmond'>
> >> ]>
> >> <GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
> >> <CLUSTER NAME="DERP" LOCALTIME="1290018259" OWNER="HERP" LATLONG="unspecified" URL="unspecified">
> >> </CLUSTER>
> >> </GANGLIA_XML>
> >> Connection closed by foreign host.
> >>
> >>
> >>
> >> 
> --------------------------------------------------------------
> ----------------
> >> Beautiful is writing same markup.  [huh?]
> >>
> >>
> >>
> >> 
> --------------------------------------------------------------
> ----------------
> >> Beautiful is writing same markup. Internet Explorer 9 supports
> >> standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
> >> Spend less time writing and  rewriting code and more time 
> creating great
> >> experiences on the web. Be a part of the beta today
> >> http://p.sf.net/sfu/msIE9-sfdev2dev
> >> _______________________________________________
> >> Ganglia-general mailing list
> >> Ganglia-general@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/ganglia-general
> >>
> >
> 
> 
