Hi all,

I have found quite a few references in the archives to this general problem, but nothing that has actually helped me solve it.
My error, from /var/log/syslog:

  Apr 20 10:43:03 selenium-s2 /usr/sbin/gmetad[5551]: Process XML (HarnessLive - Selenium): XML_ParseBuffer() error at line 49: not well-formed

My environment: ubuntu/edgy:
  -- setup 1: ganglia 3.0.4, compiled from .tar.gz
  -- setup 2: ganglia-monitor + gmetad versions 2.5.7-3 (Ubuntu packages, from archive.ubuntu.com)

These machines are dual-NIC, each with one interface configured on a 192.168 network that has quite a few other machines on it. Those other machines are not yet running gmond, but will. These machines are meant to be replicated services, so they'll both be running gmond + gmetad + web front end at some point.

On both machines, regardless of the setup, I have the same issue, so I'm fairly sure this is only a configuration problem.

My XML output from gmond looks great -- exactly what I put in the gmond.conf file. My XML output from gmetad is of course the problem. Discarding the DTD, the XML is:

  <GANGLIA_XML VERSION="2.5.7" SOURCE="gmetad">
  <GRID NAME="SeleniumFarm" AUTHORITY="http://munged/mon/ganglia/" LOCALTIME="1177089707">
  </GRID>
  </GANGLIA_XML>

Line 49 corresponds to the line starting with <GRID.

I ran xmllint, as specified:

  xmllint --valid --noout gmetad.out

and got no errors; it exited silently. The Ganglia page recommends that I email this list at this point, after having searched the archives, checked xmllint, etc.

My configuration files are somewhat different on the two machines, as they came from different sources, but here's the summary from the 3.0.4 source version:

----- gmond.conf ----
cluster {
  name = "FooBar - Selenium"
  owner = "ABC"
  latlong = "unspecified"
  url = "http://localhost/munged"
}

udp_send_channel {
  mcast_join = 239.2.11.71
  mcast_if = eth3
  host = 192.168.1.1
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  mcast_if = eth3
  port = 8649
  bind = 239.2.11.71
}
----- and so forth -----

Note: gmond seems to be quite happy with the multicast.

---- gmetad.conf ----
data_source "FooBar - Selenium" 192.168.1.1:8649
# this configuration had no success either:
# data_source "FooBar - Selenium" localhost
gridname "SeleniumFarm"
authority "http://localhost/ganglia"
----- and so forth, all else is defaults -----

No trusted hosts are configured (as localhost is always trusted). I am not sure whether I have to specify the 192.168.1.1 IP as a trusted host.

What concerns me is that the XML I'm getting from gmetad has a GRID element that does not appear to contain any CLUSTER. I haven't seen much actual XML output posted in the archives, so I have nothing to compare this with. Should my GRID element be populated with CLUSTER elements?

The documentation indicates that gmetad 'decides' whether the data source is a grid or a cluster, and it should be a cluster (it should be listening to the gmond on localhost/eth3), but the XML sure looks as though gmetad thinks my data source is a GRID. One piece of the documentation mentioned that my cluster would be 'wrapped' in a grid -- but I think my cluster is being lost. To reiterate, the XML from gmond looks great -- stats everywhere, cluster name specified accordingly, etc.

Any help on this would be great, as would a description of exactly what the XML from gmetad *should* look like for a single-gmond data source configuration.

-Jen
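
For comparing the two XML dumps, a minimal Python sketch along the following lines may help. It is only an illustration, not part of Ganglia: the function names are made up for this example, and it assumes the daemon writes its XML to any TCP client that connects and then closes the connection. It uses Python's expat binding because the XML_ParseBuffer() call in the syslog line is an expat function, so the reported line numbers should be comparable, though that is an assumption rather than something verified against gmetad itself.

```python
import socket
import xml.etree.ElementTree as ET
from xml.parsers import expat


def fetch_xml(host, port, timeout=5.0):
    """Read everything a daemon writes on connect, until it closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def first_parse_error(xml_bytes):
    """Return (line, column, message) for the first expat error, or None."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_bytes, True)
    except expat.ExpatError as err:
        return err.lineno, err.offset, expat.ErrorString(err.code)
    return None


def clusters_per_grid(xml_bytes):
    """Map each GRID NAME to the CLUSTER names nested directly inside it."""
    root = ET.fromstring(xml_bytes)
    return {
        grid.get("NAME"): [c.get("NAME") for c in grid.findall("CLUSTER")]
        for grid in root.iter("GRID")
    }
```

Hypothetical usage, assuming gmond answers TCP on 192.168.1.1:8649 (the address gmetad.conf points at) and gmetad on its XML port: run first_parse_error(fetch_xml("192.168.1.1", 8649)) against gmond, since that is the stream gmetad parses for the data source, and clusters_per_grid(...) against the gmetad dump to see whether any CLUSTER made it into the GRID.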