Re: [Ganglia-developers] Gmetad bottlenecks

Devon H. O'Dell Sat, 07 Dec 2013 05:43:50 -0800

2013/12/6 Nicholas Satterly <nfsatte...@gmail.com>:
> Excellent work, Devon. A few comments...
>
> What happens when the RRD TCP receive queue fills up and the socket blocks
> or if the TCP socket is closed by RRDcached for some reason? Does each
> gmetad data thread block/time out/retry the connection?


On connection errors, the threads retry once. There's a 5 second
timeout for waiting for data from rrdcached. If some other behavior is
desirable, it's not hard to make work. And rrdcached isn't made a
requirement with these changes; if it's not configured, things will
just write to disk as they did before. If it can't create the socket /
connect / set non-blocking, right now, it errors out.

I can imagine that last point is undesirable, but I wasn't really sure
what else to do. If different behavior is wanted, it's easy to change.

This is currently working fine for our load, which ends up generating
/ updating several hundred thousand RRDs.

> And with regard to longer-term projects, I don't think "refactoring gmetad
> and gmond into a single process that shares memory" will work because gmond
> runs on lots of remote servers (unless I'm missing something here).

This might be specifically useful to our setup; I'd defer to Vlad on
that one. I just dug into the codebase; I don't actually know much
about using it :). It seems like some other data serialization format
(than XML) would be good at the very least.

> Regards,
> Nick

--dho

> On Fri, Dec 6, 2013 at 9:12 PM, <daniel.j.marr...@us.hsbc.com> wrote:
>>
>>
>> Very interesting read.  In fact we are struggling with similar issues of
>> scale within our environment.  I am keenly interested in the outcome of your
>> efforts and, if I was a stronger 'C' coder would offer my assistance.
>>
>> Daniel J Marrera
>> Global Unix Tooling & Support
>>
>>
>>
>> From: Vladimir Vuksan <vli...@veus.hr>
>> To: "ganglia-developers@lists.sourceforge.net"
>> <ganglia-developers@lists.sourceforge.net>
>> Date: 12/06/2013 02:37 PM
>> Subject: [Ganglia-developers] Gmetad bottlenecks
>>
>> ________________________________
>>
>>
>>
>> Hello everyone,
>>
>> For few weeks now we have had performance issues due to growth of our
>> monitoring setup. One of my colleagues Devon O'Dell volunteered to help and
>> below is an e-mail of his findings.
>>
>> We'll submit a pull request once we are comfortable with the changes
>>
>> https://github.com/dhobsd/monitor-core/compare/master
>>
>> Vladimir
>>
>> ==== Forwarded message ====
>>
>> Vlad emailed some time ago about issues we're having with Ganglia
>> performance. Over the past couple weeks, I spent some time figuring out how
>> Ganglia works and attempting to identify / solve the performance issues. The
>> issue is fundamentally one of scale: the number of metrics we monitor times
>> the number of servers we have (times the number of metrics we sum!) ends up
>> being a large number.
>>
>> The Ganglia core is comprised of two daemons, `gmond` and `gmetad`.
>> `Gmond` is primarily responsible for sending and receiving metrics; `gmetad`
>> carries the hefty task of summarizing / aggregating the information, writing
>> the metrics information to graphing utilities (such as RRD), and reporting
>> summary metrics to the web front-end. Due to growth some metrics were never
>> updating, and front-end response time was abysmal. These issues are tied
>> directly to `gmetad`.
>>
>> In our Ganglia setup, we run a `gmond` to collect data for every machine
>> and several `gmetad` processes:
>>
>>  * An interactive `gmetad` process is responsible solely for reporting
>> summary statistics to the web interface.
>>  * Another `gmetad` process is responsible for writing graphs.
>>
>> Initially, I spent a large amount of time using the `perf` utility to
>> attempt to find the bottleneck in the interactive `gmetad` service. I found
>> that the hash table implementation in `gmetad` leaves a lot to be desired:
>> apart from very poor behavior in the face of concurrency, it also is the
>> wrong datastructure to use for this purpose. Unfortunately, fixing this
>> would require rewriting large swaths of Ganglia, so this was punted.
>> Instead, Vlad suggested we simply reduce the number of summarized metrics by
>> explicitly stating which metrics are summarized.
>>
>> This improved the performance of the interactive process (and thus of the
>> web interface), but didn't address other issues: graphs still weren't
>> updating properly (or at all, in some cases). Running `perf` on the graphing
>> `gmetad` process revealed that the issue was largely one of serialization:
>> although we had thought we had configured `gmetad` to use `rrdcached` to
>> improve caching performance, the way that Ganglia calls librrd doesn't
>> actually end up using rrdcached -- `gmetad` was writing directly to disk
>> every time, forcing us to spin in the kernel. Additionally, librrd isn't
>> thread-safe (and its thread-safe API is broken). All calls to the RRD API
>> are serialized, and each call to create or update not only hit disk, but
>> prevented any other thread from calling create or update. We have 47 threads
>> running at any time, all generally trying to write data to an RRD file.
>>
>> Modifying gmetad to call the proper librrd function to call into rrdcached
>> helped a little, but most threads were still spending all their time
>> spinning on locks: although we were writing to rrdcached now, we were doing
>> so over a single file descriptor to a unix domain socket. This forced the
>> kernel to serialize the reads and writes from all the different threads to
>> the single file descriptor. The only reason we gained any performance was
>> due to not hitting disk.
>>
>> To solve this problem, I made rrdcached listen on a TCP socket and gave
>> every thread in gmetad its own file descriptor to connect to rrdcached. This
>> allowed every thread to write to rrdcached without locking for updates
>> (creating new RRDs still requires holding a lock and calling into librrd).
>> This worked, and I suspect that we'll be able to move forward for some time
>> with these changes. They are running on (censored) right now, and we'll
>> leave them running for a while to make sure they're good before pushing the
>> patches upstream.
>>
>> In the process of doing this, I noticed that ganglia used a particularly
>> poor method for reading its XML metrics from gmond: It initialized a
>> 1024-byte buffer, read into it, and if it would overflow, it would realloc
>> the buffer with an additional 1024 bytes and try reading again. When dealing
>> with XML files many megabytes in size, this caused many unnecessary
>> reallocations. I modified this code to start with a 128KB buffer and double
>> the buffer size when it runs out of space. (I made a similar change to the
>> code for decompressing gzip'ed data that used a similar buffer sizing
>> paradigm).
>>
>> After all these changes, both the interactive and RRD-writing processes
>> spend most of their time in the hash table. I can continue improving Ganglia
>> performance, but most of the low hanging fruit is now gone; at some me point
>> it will require:
>>
>>  * writing a version of librrd (this probably also means changing the rrd
>> file format),
>>  * replacing the hash table in Ganglia with one that performs better,
>>  * changing the data serialization format from XML to one that is easier /
>> faster to parse,
>>  * using a different data structure than a hash table for metrics
>> hierarchies (probably a tree with metrics stored at each level in contiguous
>> memory and an index describing each metric at each level)
>>  * refactoring gmetad and gmond into a single process that shares memory
>>
>> These are all longer-term projects, but I think that they'll probably
>> eventually be useful.
>>
>> ________________________________
>>
>> ******************************************************************
>> This message originated from the Internet. Its originator may or
>> may not be who they claim to be and the information contained in
>> the message and any attachments may or may not be accurate.
>>
>> ******************************************************************------------------------------------------------------------------------------
>>
>>
>> Sponsored by Intel(R) XDK
>> Develop, test and display web and hybrid apps with a single code base.
>> Download it for free now!
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
>>
>> -----------------------------------------
>> ******************************************************************
>> This message originated from the Internet.  Its originator may or
>> may not be who they claim to be and the information contained in
>> the message and any attachments may or may not be accurate.
>>
>> ******************************************************************_______________________________________________
>> Ganglia-developers mailing list
>> Ganglia-developers@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
>>
>>
>> -----------------------------------------
>> ******************************************************************
>> This message originated from the Internet.  Its originator may or
>> may not be who they claim to be and the information contained in
>> the message and any attachments may or may not be accurate.
>> ******************************************************************
>>
>> -----------------------------------------
>> ****************************************************************** This
>> E-mail is confidential. It may also be legally privileged. If you are not
>> the addressee you may not copy, forward, disclose or use any part of it. If
>> you have received this message in error, please delete it and all copies
>> from your system and notify the sender immediately by return E-mail.
>> Internet communications cannot be guaranteed to be timely, secure, error or
>> virus-free. The sender does not accept liability for any errors or
>> omissions.
>> ****************************************************************** SAVE
>> PAPER - THINK BEFORE YOU PRINT!
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Sponsored by Intel(R) XDK
>> Develop, test and display web and hybrid apps with a single code base.
>> Download it for free now!
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Ganglia-developers mailing list
>> Ganglia-developers@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
>>
>
>
>
> --
> gpg: using PGP trust model
> pub   4096R/1EE38BD9 2013-01-06 [expires: 2018-01-06]
>       Key fingerprint = 3EE9 550D D9D8 DB65 58C2  B58D CE78 EC6C 1EE3 8BD9
> uid                  Nicholas Satterly (Debian Key) <nfsatte...@gmail.com>
> sub   4096R/23804EE9 2013-01-06 [expires: 2018-01-06]
>
>
> ------------------------------------------------------------------------------
> Sponsored by Intel(R) XDK
> Develop, test and display web and hybrid apps with a single code base.
> Download it for free now!
> http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
> _______________________________________________
> Ganglia-developers mailing list
> Ganglia-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/ganglia-developers
>

------------------------------------------------------------------------------
Sponsored by Intel(R) XDK 
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers

Re: [Ganglia-developers] Gmetad bottlenecks

Reply via email to