Re: [Ganglia-general] File I/O bottleneck (Conflicting gmetad versions?)
I think that's my problem. Someone else set up the first two servers, so I
didn't know rrdcached existed.

Thanks,
Aaron

On Fri, Jun 19, 2015 at 10:50 AM, Vladimir Vuksan wrote:
> Are you using rrdcached? That would be my first recommendation if you are
> running into I/O issues.
>
> Vladimir
>
> On 06/18/2015 at 06:26 PM, Aaron Thomas Holt wrote:
>
>> Hello all,
>> I have 2 central servers collecting data for ~1500 nodes on a multicast
>> setup. These servers are able to handle the load no problem. These two
>> servers unicast to a third server (which ideally contains a copy of all
>> the RRDs on the first two servers). Unfortunately, the third server is
>> unable to keep up, and I've found this is due to disk I/O. Server 3 has
>> better hardware specs than the first two servers, so that shouldn't be
>> the problem.
>>
>> Some information:
>> Servers 1 & 2 are running gmetad 3.6.0. Running strace on these servers
>> reveals that the RRD files are kept open and written to.
>> Server 3 is running gmetad 3.7.1. After running strace on this server, I
>> found that every time an update to an RRD file is made, the file is
>> opened, written to, then closed.
>>
>> I strongly suspect the disk I/O bottleneck is due to the RRDs being
>> opened and closed for every write. Any ideas on why this is happening
>> and how I can fix it?
Re: [Ganglia-general] File I/O bottleneck (Conflicting gmetad versions?)
Are you using rrdcached? That would be my first recommendation if you are
running into I/O issues.

Vladimir

On 06/18/2015 at 06:26 PM, Aaron Thomas Holt wrote:
> Hello all,
> I have 2 central servers collecting data for ~1500 nodes on a multicast
> setup. These servers are able to handle the load no problem. These two
> servers unicast to a third server (which ideally contains a copy of all
> the RRDs on the first two servers). Unfortunately, the third server is
> unable to keep up, and I've found this is due to disk I/O. Server 3 has
> better hardware specs than the first two servers, so that shouldn't be
> the problem.
>
> Some information:
> Servers 1 & 2 are running gmetad 3.6.0. Running strace on these servers
> reveals that the RRD files are kept open and written to.
> Server 3 is running gmetad 3.7.1. After running strace on this server, I
> found that every time an update to an RRD file is made, the file is
> opened, written to, then closed.
>
> I strongly suspect the disk I/O bottleneck is due to the RRDs being
> opened and closed for every write. Any ideas on why this is happening
> and how I can fix it?
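For reference, wiring rrdcached in front of gmetad typically looks something
like the sketch below. The flags are standard rrdcached options, but the
socket path, directories, and timer values are placeholders, and how gmetad
is pointed at the daemon varies by version (newer gmetad releases accept an
rrdcached_address setting in gmetad.conf; otherwise exporting
RRDCACHED_ADDRESS in the environment gmetad starts under is honored by
rrdtool 1.4+), so treat this as a sketch rather than a drop-in config:

  # Start rrdcached with a journal, flushing dirty data roughly every 30
  # minutes (paths and timings are placeholders -- tune for your setup).
  rrdcached -l unix:/var/run/rrdcached.sock \
            -j /var/lib/rrdcached/journal \
            -b /var/lib/ganglia/rrds -B \
            -w 1800 -z 1800 -f 3600 \
            -p /var/run/rrdcached.pid

  # Point gmetad at the daemon: either an rrdcached_address line in
  # gmetad.conf (if your build supports it) or the environment variable
  # set in the init script / systemd unit that launches gmetad.
  export RRDCACHED_ADDRESS=unix:/var/run/rrdcached.sock

With the daemon in place, updates are batched in memory and journaled, so the
per-metric open/write/close churn on the RRD files largely disappears.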
[Ganglia-general] File I/O bottleneck (Conflicting gmetad versions?)
Hello all,

I have 2 central servers collecting data for ~1500 nodes on a multicast
setup. These servers are able to handle the load no problem. These two
servers unicast to a third server (which ideally contains a copy of all the
RRDs on the first two servers). Unfortunately, the third server is unable to
keep up, and I've found this is due to disk I/O. Server 3 has better hardware
specs than the first two servers, so that shouldn't be the problem.

Some information:
Servers 1 & 2 are running gmetad 3.6.0. Running strace on these servers
reveals that the RRD files are kept open and written to.
Server 3 is running gmetad 3.7.1. After running strace on this server, I
found that every time an update to an RRD file is made, the file is opened,
written to, then closed.

I strongly suspect the disk I/O bottleneck is due to the RRDs being opened
and closed for every write. Any ideas on why this is happening and how I can
fix it?

Thanks,
Aaron

--
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
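For anyone who wants to reproduce the observation above, a command along
these lines shows the per-update open/write/close pattern; the exact flags
and syscall names are illustrative (newer kernels report openat rather than
open, and -y needs a reasonably recent strace):

  # Attach to the running gmetad and trace file syscalls; -y decodes file
  # descriptors to paths so writes and closes on .rrd files are visible too.
  strace -f -y -e trace=open,openat,write,close -p "$(pidof gmetad)" 2>&1 \
    | grep '\.rrd'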