Re: [Ganglia-general] Unable to see nodes
Hi Bernard,

Good call. Switching to unicast works perfectly. Thanks for the suggestion.

Eric

On 07/29/2012 04:13 PM, Bernard Li wrote:
> Hi Eric:
>
> Just try setting up unicast between the head node and one of the other
> hosts, and see if that works.
>
> Regards,
>
> Bernard
>
> On Sun, Jul 29, 2012 at 9:20 AM, Eric Dow wrote:
>> Thanks for the quick response.
>>
>> [rest of quoted message and gmond.conf snipped -- the full message appears below in this thread]
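For anyone hitting the same multicast problem, here is a minimal unicast sketch along the lines Bernard suggested; the aggregator name "head01" and the interval value are assumptions, not taken from Eric's actual setup:

  /* on every node: send metrics straight to the aggregator
     instead of the multicast group */
  globals {
    /* in unicast mode this should be non-zero so metric metadata is
       re-sent if the aggregator gmond restarts */
    send_metadata_interval = 30
  }

  udp_send_channel {
    host = head01
    port = 8649
  }

  /* on the aggregator (head node) only: listen for the unicast packets
     and keep serving the cluster XML over TCP */
  udp_recv_channel {
    port = 8649
  }

  tcp_accept_channel {
    port = 8649
  }

Nodes that only send metrics can additionally set deaf = yes so they don't spend memory aggregating other hosts' data.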
Re: [Ganglia-general] Unable to see nodes
Hi Eric:

Just try setting up unicast between the head node and one of the other
hosts, and see if that works.

Regards,

Bernard

On Sun, Jul 29, 2012 at 9:20 AM, Eric Dow wrote:
> Thanks for the quick response.
>
> [rest of quoted message and gmond.conf snipped -- the full message appears below in this thread]
Re: [Ganglia-general] Unable to see nodes
Thanks for the quick response.

I am configured to run in multicast (the default). My head node and all
compute nodes are connected to a single router. I don't know if there is a
way to make sure that multicast is set up properly. One thing that I checked
was this (from the ganglia FAQ):

  Confirm that UDP connections can be established between the gmetad and
  gmond (or gmond and other gmonds, for multicast purposes) by running
  "nc -u -l 8653" on the host in question, then "echo "hello" | nc -u <host> 8653"
  from the gmetad or another gmond.

This worked. I saw "hello" appear on my compute node when I did this.

My gmetad.conf is the same as the standard file provided with ganglia. The
only modification is the data_source line:

  data_source "ACDL Cluster" localhost

Finally, here's my gmond.conf:

/* This configuration is as close to 2.5.x default behavior as possible
   The values closely match ./gmond/metric.h definitions in 2.5.x */
globals {
  daemonize = yes
  setuid = yes
  user = nobody
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 86400 /*secs. Expires (removes from web interface) hosts in 1 day */
  host_tmax = 20 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  send_metadata_interval = 0 /*secs */
}

/*
 * The cluster attributes specified will be used as part of the <CLUSTER>
 * tag that will wrap all hosts collected by this instance.
 */
cluster {
  name = "ACDL Cluster"
  owner = "ACDL"
  latlong = "unspecified"
  url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "ACDL"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname.  Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
  retry_bind = true
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}

/* Channel to receive sFlow datagrams */
#udp_recv_channel {
#  port = 6343
#}

/* Optional sFlow settings */
#sflow {
# udp_port = 6343
# accept_vm_metrics = yes
# accept_jvm_metrics = yes
# multiple_jvm_instances = no
# accept_http_metrics = yes
# multiple_http_instances = no
# accept_memcache_metrics = yes
# multiple_memcache_instances = no
#}

/* Each metrics module that is referenced by gmond must be specified and
   loaded.  If the module has been statically linked with gmond, it does
   not require a load path.  However all dynamically loadable modules must
   include a load path. */
modules {
  module {
    name = "core_metrics"
  }
  module {
    name = "cpu_module"
    path = "modcpu.so"
  }
  module {
    name = "disk_module"
    path = "moddisk.so"
  }
  module {
    name = "load_module"
    path = "modload.so"
  }
  module {
    name = "mem_module"
    path = "modmem.so"
  }
  module {
    name = "net_module"
    path = "modnet.so"
  }
  module {
    name = "proc_module"
    path = "modproc.so"
  }
  module {
    name = "sys_module"
    path = "modsys.so"
  }
}

/* The old internal 2.5.x metric array has been replaced by the following
   collection_group directives.  What follows is the default behavior for
   collecting and sending metrics that is as close to 2.5.x behavior as
   possible. */

/* This collection group will cause a heartbeat (or beacon) to be sent every
   20 seconds.  In the heartbeat is the GMOND_STARTED data which expresses
   the age of the running gmond. */
collection_group {
  collect_once = yes
  time_threshold = 20
  metric {
    name = "heartbeat"
  }
}

/* This collection group will send general info about this host every
   1200 secs.
   This information doesn't change between reboots and is only collected
   once. */
collection_group {
  collect_once = yes
  time_threshold = 1200
  metric {
    name = "cpu_num"
    title = "CPU Count"
  }
  metric {
    name = "cpu_speed"
    title = "CPU Speed"
  }
  metric {
    name = "mem_total"
    title = "Memory Total"
  }
  /* Should this be here? Swap can be added/removed between reboots. */
  metric {
    name = "swap_total"
    title = "Swap Space Total"
  }
  metric {
    name = "boottime"
    title = "Last Boot Time"
  }
  metric {
    name = "machine_type"
    title = "Machine Type"
  }
  metric {
    name = "os_name"
    title = "Operating System"
  }
  metric {
    name = "os_release"
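Spelled out, the FAQ check Eric ran looks like this (compute01 is a placeholder for whichever host is being tested):

  # on the compute node being checked: listen for a UDP datagram
  nc -u -l 8653

  # from the gmetad host (or another gmond host): send a test message
  echo "hello" | nc -u compute01 8653

Note that this only proves ordinary unicast UDP gets through; it says nothing about whether multicast traffic to 239.2.11.71 is actually delivered, which is why the test can pass while the other nodes still never show up.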
Re: [Ganglia-general] Unable to see nodes
On Sat, Jul 28, 2012 at 11:53 AM, Eric Dow wrote:
> Hi,
>
> I am running a cluster with Ubuntu 12.04 and I installed ganglia 3.4.0
> from the tarball provided on the site.
>
> I installed gmetad and gmond onto my head node, and gmond onto each of the
> compute nodes. I am able to see my head node just fine from the ganglia
> interface, and can view the various metrics for the head node. However, I
> cannot see any of the other nodes, even though I am running gmond on all of
> them. The gmond.conf file is identical on the head node and all the nodes.

This sounds like you are running multicast - can you confirm that you're
running in multicast mode? If you are running in unicast mode then the
config for all the nodes probably shouldn't be identical.

If you aren't sure then yes, you should post your gmond.conf. It would also
be helpful to see what you have configured in gmetad for your clusters.
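On the gmetad side there is usually little more than the data_source line; roughly, and with placeholder hostnames:

  # multicast: every gmond hears every host, so polling the local one is enough
  data_source "ACDL Cluster" localhost:8649

  # unicast: poll the gmond(s) that act as aggregators/receivers
  data_source "ACDL Cluster" head01:8649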
Re: [Ganglia-general] Unable to see nodes
Hi Eric:

By default gmond is configured to run in multicast mode. If you can only
see your head node, chances are multicast does not work in your environment.

When you nc to port 8651, you should see XML of not just your head node,
but other hosts and their corresponding metrics as well. And 8651 is the
correct port for gmetad's non-interactive port; not sure who put 8650 there,
because that's wrong.

Try setting up unicast and see if that works.

Cheers,

Bernard

On Sat, Jul 28, 2012 at 10:53 AM, Eric Dow wrote:
> Hi,
>
> I am running a cluster with Ubuntu 12.04 and I installed ganglia 3.4.0 from
> the tarball provided on the site.
>
> I installed gmetad and gmond onto my head node, and gmond onto each of the
> compute nodes. I am able to see my head node just fine from the ganglia
> interface, and can view the various metrics for the head node. However, I
> cannot see any of the other nodes, even though I am running gmond on all of
> them. The gmond.conf file is identical on the head node and all the nodes.
>
> I tried all the debug steps found here
> http://sourceforge.net/apps/trac/ganglia/wiki/FAQ and everything works as it
> should. The only difference was that I need to use "nc <host> 8651" rather
> than 8650 to see the XML data from the head node. Other than that, everything
> seems to be working.
>
> I can post my gmond.conf file if that will help, or any other info for that
> matter. I think ganglia is amazing and would really like to get it up and
> running. Any suggestions would be appreciated.
>
> Thanks
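One way to follow up on Bernard's hint is to dump the XML and list the hosts in it; head01 stands in for whichever machine runs gmetad/gmond:

  # hosts known to gmetad (non-interactive XML port)
  nc head01 8651 | grep -o '<HOST NAME="[^"]*"'

  # hosts a single gmond has heard from (its tcp_accept_channel)
  nc head01 8649 | grep -o '<HOST NAME="[^"]*"'

If only the head node appears, the other gmonds' multicast packets are never arriving, which is consistent with the unicast suggestion.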
[Ganglia-general] Unable to see nodes
Hi,

I am running a cluster with Ubuntu 12.04 and I installed ganglia 3.4.0
from the tarball provided on the site.

I installed gmetad and gmond onto my head node, and gmond onto each of the
compute nodes. I am able to see my head node just fine from the ganglia
interface, and can view the various metrics for the head node. However, I
cannot see any of the other nodes, even though I am running gmond on all of
them. The gmond.conf file is identical on the head node and all the nodes.

I tried all the debug steps found here
http://sourceforge.net/apps/trac/ganglia/wiki/FAQ and everything works as it
should. The only difference was that I need to use "nc <host> 8651" rather
than 8650 to see the XML data from the head node. Other than that, everything
seems to be working.

I can post my gmond.conf file if that will help, or any other info for that
matter. I think ganglia is amazing and would really like to get it up and
running. Any suggestions would be appreciated.

Thanks
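For reference on the 8650 vs 8651 mix-up discussed above, these are the stock defaults from the sample gmetad.conf and gmond.conf shipped with Ganglia:

  # gmetad.conf -- 8651 dumps the full XML tree (the non-interactive port),
  # 8652 answers interactive queries
  xml_port 8651
  interactive_port 8652

  # gmond.conf -- each gmond serves its own cluster's XML here
  tcp_accept_channel {
    port = 8649
  }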