[Ganglia-developers] UDP / mute=yes and high cpu usage
Hi guys,

I've been struggling with very high CPU usage of my gmond daemons lately. I'm using a UDP unicast topology and just couldn't find the source of the problem: the gmond on every one of my servers was using ~99-100% CPU. Running strace on the gmond processes revealed nothing but a flood of epoll and gettimeofday syscalls (very many every second).

I tried starting from scratch, and there was no problem with the default config (multicast topology, deaf/mute=no). After experimenting for a while I thought that _maybe_ deaf=no on a gmond that only sends data to an aggregator (only a udp_send_channel, no receive channels or TCP channels) was the cause. And that was it.

My mistake was assuming that deaf=no doesn't matter in UDP unicast, because we simply don't listen unless a recv_channel is configured. So I think this could be a bug, not a feature. What are your thoughts on this?

--
regards, Maciej Lasyk
GPG key ID: FFA8AEEC
GPG info: http://maciek.lasyk.info/gpg.txt

___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
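The strace observation above can be made quantitative by tallying syscall names from a captured trace. The following is a minimal sketch in Python, not part of the original post: the function name `tally_syscalls` is hypothetical, the sample lines in the usage note are illustrative rather than taken from the poster's trace, and it assumes plain `strace -p PID` output with one syscall per line (optionally prefixed by `[pid N]`).

```python
import re
from collections import Counter

# Matches the syscall name at the start of a strace line, e.g.
# "epoll_wait(3, {}, 16, 0) = 0" or "[pid 1234] gettimeofday(...) = 0".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s*)?(\w+)\(")


def tally_syscalls(strace_lines):
    """Count how often each syscall name appears in strace output lines."""
    counts = Counter()
    for line in strace_lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal markers such as "--- SIGCHLD ---" are skipped
            counts[match.group(1)] += 1
    return counts
```

For example, `tally_syscalls(open("gmond.trace"))` (where `gmond.trace` is a hypothetical capture file) returns a `Counter` whose `most_common()` method lists the dominant syscalls first. `strace -c` produces a similar summary without any extra tooling.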
Re: [Ganglia-developers] UDP / mute=yes and high cpu usage
Maciej,

can you post the top 100 lines or so of your config, i.e. with all the UDP channels etc.?

Thanks

On 02/07/2014 08:52 AM, Maciej Lasyk wrote:
> I've been struggling with very high CPU usage of my gmond daemons lately. [...]
Re: [Ganglia-developers] UDP / mute=yes and high cpu usage
Sure, it's not that long so I'm posting it inline:

globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  send_metadata_interval = 60 /*secs */
}

cluster {
  name = somecluster
  owner = someowner
  latlong = unspecified
  url = unspecified
}

host {
  location = host
}

udp_send_channel {
  bind_hostname = yes
  port = 8649
  ttl = 2
  host = 192.168.1.23
}

.. and here go modules / metrics

On Fri, Feb 07, 2014 at 09:15:56AM -0500, Vladimir Vuksan wrote:
> Maciej, can you post top 100 lines or so of your config ie. with all the udp channels etc.

--
regards, Maciej Lasyk
GPG key ID: FFA8AEEC
GPG info: http://maciek.lasyk.info/gpg.txt
Re: [Ganglia-developers] UDP / mute=yes and high cpu usage
Set deaf=yes. Let me know if that lowers the CPU usage.

Vladimir

On 02/07/2014 09:59 AM, Maciej Lasyk wrote:
> Sure, it's not that long so I'm posting it in-place: [...]
Re: [Ganglia-developers] UDP / mute=yes and high cpu usage
Oh yes, setting deaf=yes is the solution. I mentioned that in my first email (sorry that I was not very clear about it).

My question, however, was: is that a bug or a feature? It can be tricky, because deaf=no is the default and it is not easy to trace the problem back to it. The Monitoring with Ganglia book does show (in the picture on page 22) that the gmond nodes are deaf; but maybe (just maybe) gmond could simply not listen to anything when it has no recv_channels defined? :)

--
regards, Maciej Lasyk
GPG key ID: FFA8AEEC
GPG info: http://maciek.lasyk.info/gpg.txt
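Until the bug-or-feature question is settled, the risky combination described in this thread (a send channel, no receive channels, and deaf left at no) can at least be detected before deployment. Below is a minimal sketch in Python. It is a regex scan, not a real gmond.conf parser; the function names are hypothetical, and it assumes that `udp_recv_channel` and `tcp_accept_channel` are the only sections that make gmond listen and that an absent `deaf` setting means no.

```python
import re


def strip_comments(text):
    """Drop /* ... */ and # comments, as used in the config posted above."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    return re.sub(r"#[^\n]*", "", text)


def send_only_but_listening(conf_text):
    """True if the config sends via UDP, defines no receive channel,
    and does not set deaf = yes (the combination reported in this thread)."""
    text = strip_comments(conf_text)
    has_send = re.search(r"\budp_send_channel\s*\{", text) is not None
    has_recv = re.search(r"\b(?:udp_recv_channel|tcp_accept_channel)\s*\{", text) is not None
    deaf = re.search(r"\bdeaf\s*=\s*\"?(\w+)", text)
    # Assumption: when deaf is not set at all, treat it as "no".
    deaf_value = deaf.group(1).lower() if deaf else "no"
    return has_send and not has_recv and deaf_value != "yes"
```

Run against the config posted earlier in the thread, this returns True; after changing `deaf = no` to `deaf = yes` it returns False.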
[Ganglia-developers] GSoC application started, more help needed
Please feel free to add potential project ideas here:
https://github.com/ganglia/monitor-core/wiki/GSoC-2014-project-ideas

For an example of how project ideas are documented in other organisations, see these pages:
https://wiki.debian.org/SummerOfCode2014/Projects
http://community.apache.org/gsoc.html

For those who have a Google account, please also register at http://www.google-melange.com and ask to be added as a mentor for the Ganglia organisation.

If you are willing to be a mentor (or to work as part of a joint mentoring team, which makes things easier for everybody), please include your name and a link to your blog, Github profile or similar on the wiki. These things will improve Ganglia's chances of getting selected.
Re: [Ganglia-developers] GSoC application started, more help needed
On 02/07/2014 09:46 PM, Daniel Pocock wrote:
> Please feel free to add potential project ideas here:
> https://github.com/ganglia/monitor-core/wiki/GSoC-2014-project-ideas

Hi!

There have been several discussions on the list about the points below. I will reiterate the basic ideas in order to get some kind of definitive closure and to see whether they are worth doing (in GSoC or otherwise). I will refer only to the gmond framework:

1. Add a string to globals, similar to hostname, named something like host_uuid. It could contain either a fixed (overridden) uuid, or some automatic approach could be chosen later, with a sensible default such as empty. This could pave the way for a uuid-to-metrics association instead of hostname-to-metrics.

2. Make the cluster name (also) a pool metric of the host. This could pave the way for gmond aggregators: gmonds that gather data from devices in close network proximity but in different logical partitions (clusters). I think something like this would be useful in clouds or in distributed computing associations like the grid.

I imagine/hope that these add-ons would have no impact on gmetad and would be completely backward compatible.

So, what do the experts think?

Thank you for taking this into consideration,
Adrian
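To make the first proposal concrete, here is a small illustrative sketch in Python of what a uuid-to-metrics association with an empty default could look like. It is not gmond or gmetad code: `metric_owner_key` and `record` are hypothetical names, and the fallback to the hostname when host_uuid is empty is an assumption about how backward compatibility might be kept.

```python
def metric_owner_key(host_uuid, hostname):
    """Key under which a host's metrics are stored: the uuid when one is
    configured, otherwise the hostname (assumed backward-compatible default)."""
    return host_uuid.strip() or hostname


def record(store, host_uuid, hostname, metric, value):
    """Store one metric value under the host's key in a plain dict."""
    store.setdefault(metric_owner_key(host_uuid, hostname), {})[metric] = value
```

With this keying, a host that is renamed but keeps its host_uuid continues to accumulate metrics under the same entry, while hosts without a uuid behave as they do today.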