Well, things blew up at ~184 hosts. The web interface shows a random number of hosts down on each refresh, although sometimes they are all up. It reports just ~1 second to download and process the XML: "Downloading and parsing ganglia's XML tree took 0.9751s." So I don't think timeouts are the problem. A telnet to 8649 produces the XML immediately.

Could this be the point where I need to start using a RAM-based partition, or could it be something else? Is sflow so much better that I should consider using it? Would multiple gmonds, say one per rack, all listed in gmetad, be a solution? At this point I am not sure of the next step, and I really appreciate the help the list has given me so far.
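To put numbers on the two observations above (how fast port 8649 answers, and how many hosts look down at a given moment), something like the following could be run against the head node a few times in a row. It is only a minimal sketch: it assumes gmond dumps its whole XML tree and then closes the connection, that each HOST element carries TN and TMAX attributes, and that a host counts as down once TN exceeds about 4 x TMAX. That threshold is an assumption about the web frontend, not something confirmed in this thread, and the function names are made up for illustration.

```python
import socket
import time
import xml.etree.ElementTree as ET


def fetch_xml(host, port=8649, timeout=10.0):
    """Read the XML dump gmond writes on connect; return (bytes, seconds taken)."""
    start = time.monotonic()
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # assumed: gmond closes the socket once the dump is done
                break
            chunks.append(data)
    return b"".join(chunks), time.monotonic() - start


def stale_hosts(xml_bytes, factor=4):
    """Return (total_hosts, [(name, tn, tmax), ...]) for hosts with TN > factor * TMAX.

    factor=4 is an assumed down threshold, not a confirmed frontend setting.
    """
    root = ET.fromstring(xml_bytes)
    total = 0
    stale = []
    for host in root.iter("HOST"):
        total += 1
        tn = int(host.get("TN", "0"))      # seconds since the host was last heard from
        tmax = int(host.get("TMAX", "0"))  # expected maximum gap between reports
        if tmax and tn > factor * tmax:
            stale.append((host.get("NAME"), tn, tmax))
    return total, stale


def report(host):
    """Print fetch time and the hosts that currently look stale."""
    xml_bytes, elapsed = fetch_xml(host)
    total, stale = stale_hosts(xml_bytes)
    print(f"{host}: {total} hosts, {len(stale)} stale, fetched in {elapsed:.2f}s")
    for name, tn, tmax in stale:
        print(f"  {name}: TN={tn} TMAX={tmax}")
```

Calling `report("cnode340")` several times a few seconds apart would show whether the same hosts are stale each time or whether the set changes from one run to the next, and whether the fetch time ever gets close to the 10 second gmetad timeout mentioned below.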
>Hi Mark,
>
>I assume cnode340 is the head node that all ~340 other gmond's send their
>data to. If so, you could reduce the amount of redundant metadata flying
>around by increasing "send_metadata_interval" to 120 seconds or higher.

That is correct, cnode340 is the head node for ganglia. I have increased the "send metadata interval" to 120 seconds and have 100 nodes reporting at this point and it seems pretty smooth. I am going to add the others ~50 at a time.

>Also, I suspect that if you telnet to port 8649 on your head node it will
>take a while to respond because it's busy processing incoming UDP metrics.
>If it takes more than 10 seconds to respond on a regular basis then gmetad
>will timeout [1].

So far, with the 100 I have the response is an instant dump of the XML.

>Try deploying a recently patched version of gmond [2] to the head node which
>is now multi-threaded and see if that fixes the problem. It starts a separate
>thread for responding to XML metric requests and should respond immediately
>while the main thread is still processing metrics.

I am running:

gmond 3.4.0
gmetad 3.4.0
Ganglia Web Frontend version 3.5.2

Would I need to patch gmond at this version?

<SNIP>

------------------------------------------------------------------------------
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics
Download AppDynamics Lite for free today:
http://p.sf.net/sfu/appdyn_sfd2d_oct
_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general