On Mon, 14 Jan 2013, Conrad Wood wrote:

On Mon, 2013-01-14 at 00:05 -0800, David Lang wrote:
On Mon, 14 Jan 2013, Conrad Wood wrote:

I am looking for a solution to gather, correlate and graph large sets of
data.

The data is gathered from several hundred servers (going up to several thousand
within 12 months).
Requirements are:
* 1 minute resolution of data
* individual graphs of a single value stream (e.g. IOPS for a given
block device)
* Graphs over a bunch of value streams (e.g. max IOPS per minute on
100ish block devices together)
* Data Retention in this resolution for 90 days

I looked at:
Collectd - missing data correlation and probably cannot handle large
amounts of data

Graphite/Carbon - missing data correlation

RRD - it averages data, which is *NOT* what I want. I need the absolute
MAX values in 1-Minute intervals

RRD can be configured to keep data in whatever resolution you want for whatever
time period you want. By default it averages things, but I've configured RRD
databases to have 1-minute intervals and to keep that resolution for longer time
periods.

When you create an RRD datastore, you tell it what time periods to use, and how
long to keep them for each time period. You also tell it if you want to store
MAX as well as average values for that series of data points.
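
As a rough illustration of the creation step described above, the following Python sketch builds an `rrdtool create` command line for a single value stream with 1-minute steps kept for 90 days. The file name `sda_iops.rrd`, the data source name `iops`, the GAUGE type and the 120-second heartbeat are assumptions made for the example, not settings taken from this thread. The script only prints the command; it does not run rrdtool.

```python
# Sketch only: assemble an `rrdtool create` command, do not execute it.
SECONDS_PER_STEP = 60      # 1-minute resolution, as required in the thread
RETENTION_DAYS = 90        # keep that resolution for 90 days

# Number of rows needed so one row per step covers the whole retention period.
rows = RETENTION_DAYS * 24 * 60 * 60 // SECONDS_PER_STEP


def rrd_create_args(path, ds_name="iops"):
    """Return the argument list for one RRD file (names are hypothetical)."""
    return [
        "rrdtool", "create", path,
        "--step", str(SECONDS_PER_STEP),
        # DS:<name>:<type>:<heartbeat>:<min>:<max>; heartbeat of 2 steps is an assumption
        f"DS:{ds_name}:GAUGE:{2 * SECONDS_PER_STEP}:0:U",
        # RRA:<CF>:<xff>:<steps per row>:<rows>; one step per row means no consolidation
        f"RRA:MAX:0.5:1:{rows}",
        f"RRA:AVERAGE:0.5:1:{rows}",
    ]


args = rrd_create_args("sda_iops.rrd")
print(" ".join(args))
```

With one step per row, the MAX and AVERAGE archives hold the same per-minute value; the two only differ once an archive consolidates several steps into one row.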


Thanks for the swift response.
If I calculate 300 servers, each providing approx. 30 64-bit
values at 1-minute intervals, I get 9,000 values per minute. That is
12,960,000 values per day,
so for 90 days it is 1,166,400,000 values I need to store.
I don't imagine RRD being particularly efficient at that.
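
The arithmetic above can be checked with a few lines of Python. The only added assumption is that each 64-bit value takes 8 bytes and that file headers and any extra archives are ignored, so the size is a lower bound rather than a measurement.

```python
# Back-of-the-envelope check of the value counts quoted above.
servers = 300
values_per_server = 30
retention_days = 90

per_minute = servers * values_per_server   # values arriving each minute
per_day = per_minute * 60 * 24             # values per day
total = per_day * retention_days           # values held for the full retention

BYTES_PER_VALUE = 8                        # assumption: one 64-bit value, no overhead
raw_bytes = total * BYTES_PER_VALUE

print(f"{per_minute:,} values/minute, {per_day:,} values/day")
print(f"{total:,} values, about {raw_bytes / 1e9:.1f} GB raw")
```

Under those assumptions the raw data comes to roughly 9.3 GB per consolidation function stored.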

It's actually likely to be more efficient than you are expecting. RRD knows that it's dealing with a circular buffer of a known size, which allows for a LOT of optimizations.

Nothing is magic, but as long as you have enough RAM to keep all your RRD writes in RAM (and I've heard of people using RRD with hundreds of thousands of values being tracked; RAM is large nowadays), it's not bad to store them.

Furthermore, with an RRD datastore I stick one stream of values in one
database and have no way of correlating them.
Or am I mistaken here?

Well, if you know ahead of time what you want to look at, create a separate RRD database that holds that data. Otherwise you do have to look up all the data points and add them together when you graph them, but this is true with any datastore.

Since everything is stored by time, correlating them is easy.
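
A minimal sketch of what time-based correlation could look like once the data points have been fetched: group values from several streams by their minute timestamp and take the maximum across devices. The device names and sample values are made up for illustration, and the in-memory dictionary stands in for whatever fetch mechanism the datastore provides.

```python
from collections import defaultdict

# Hypothetical sample data: {device: [(unix_timestamp, iops), ...]}
streams = {
    "sda": [(1358150400, 120), (1358150460, 340)],
    "sdb": [(1358150400, 200), (1358150460, 90)],
    "sdc": [(1358150400, 75)],  # a stream may be missing some minutes
}


def max_per_minute(streams):
    """Return {minute_start: max value across all streams in that minute}."""
    buckets = defaultdict(list)
    for points in streams.values():
        for ts, value in points:
            # Align every timestamp to the start of its minute.
            buckets[ts - ts % 60].append(value)
    return {ts: max(vals) for ts, vals in sorted(buckets.items())}


combined = max_per_minute(streams)
for ts, peak in combined.items():
    print(ts, peak)
```

Swapping `max` for `sum` gives the "add them together" variant; either way the shared timestamp is the join key.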

David Lang
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
