Question #193502 on Graphite changed:
https://answers.launchpad.net/graphite/+question/193502

    Status: Open => Answered

Michael Leinartas proposed the following answer:
So you're finding out that the particulars of how aggregation works in
the whisper database are a bit wonky.

I'm looking primarily at the first example right now. To start, you are
going about inspecting the retentions in the correct way, but the time
range is a little off. Whisper returns data from the highest-precision
archive (retention definition) that can satisfy the entire period
requested. Requesting 3 seconds of data is the right way to check the
1s:3s archive; however, Whisper does everything relative to the current
time, so you instead want to use --from=$(($(date +%s) - 3)) to hit the
first archive. In that case you should get 3 points back, all with a
value of 1.

You can also update several points at once with whisper-update.py so
that it happens quicker (before the current second rolls over).

Finally, I've noticed that aggregation behaves unexpectedly when there
aren't enough points in the first archive to make up a single point in
the 2nd archive (you found a weird edge case). The minimum retention you
should use for the first archive in this case is 1s:5s. Here's a
slightly modified script:

Script 1 modified
=======================================
#!/bin/bash

rm -f test.wsp
whisper-create.py --xFilesFactor=0 --aggregationMethod=sum test.wsp 1s:5s 5s:20s
CREATED=$(date +%s)
echo "Created: $CREATED"
whisper-update.py test.wsp $(($(date +%s))):1 $(($(date +%s)-1)):1 \
    $(($(date +%s)-2)):1 $(($(date +%s)-3)):1 $(($(date +%s)-4)):1
echo
echo Using 1s resolution:
whisper-fetch.py --from=$(($(date +%s)-5)) test.wsp
echo

echo Using 5s resolution:
whisper-fetch.py --from=$(($(date +%s)-30)) test.wsp

Output from modified script 1:
=======================================
Created: test.wsp (148 bytes)
Created: 1334621059
[('1334621059', '1'), ('1334621058', '1'), ('1334621057', '1'), ('1334621056', '1'), ('1334621055', '1')]

Using 1s resolution:
1334621055      1.000000
1334621056      1.000000
1334621057      1.000000
1334621058      1.000000
1334621059      1.000000

Using 5s resolution:
1334621040      None
1334621045      None
1334621050      None
1334621055      5.000000

This should look like what you expect. The 5 points in the first archive
are aggregated into the 1334621055 bucket as a sum. Buckets in the 5s
archive start on multiples of 5 seconds, so if you run it multiple times
you'll see that sometimes those 5 points end up in a single bucket and
sometimes they're split between two (depending on which second it's run
on).
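
To make the bucket boundaries concrete, here's a minimal sketch of the
arithmetic. bucket_start is just a helper name made up for this
illustration; it isn't part of the whisper scripts. It assumes buckets
begin on multiples of the archive's step, which matches the output
above:

```shell
#!/bin/bash

# bucket_start TIMESTAMP STEP
# Print the start of the STEP-second bucket that TIMESTAMP falls into.
bucket_start() {
    echo $(( $1 - $1 % $2 ))
}

echo "Five points starting on a multiple of 5:"
for ts in 1334621055 1334621056 1334621057 1334621058 1334621059; do
    echo "$ts -> $(bucket_start "$ts" 5)"
done

echo "The same five points shifted by 2 seconds:"
for ts in 1334621057 1334621058 1334621059 1334621060 1334621061; do
    echo "$ts -> $(bucket_start "$ts" 5)"
done
```

The first loop maps every point to 1334621055. The second splits them
between 1334621055 and 1334621060, which is the "split between two"
case.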


The 2nd script isn't doing what you expect because whisper-resize.py is 'dumb.' 
It iterates through the archives in reverse order (lowest resolution and 
longest retention to highest resolution and shortest retention), pulls the data 
out of each, and writes it to a new archive. It's best suited for simple 
resizes - extending a whisper file to cover a longer period at the lowest 
resolution for example.

Aggregation happens at storage time. Once a point is stored in an
archive (starting with the highest resolution archive), each lower
archive in turn reads the points from the higher archive, aggregates
them, and stores the result. When you store points that are older than
the first archive's retention (through a resize or explicit storage),
this propagation doesn't happen. Instead, each write goes into the same
bucket of the lower archive and overwrites the previous value.
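
As a rough illustration of the difference, here's a toy model. It is
not whisper's code, just a shell variable standing in for a single 5s
bucket, with names made up for the example:

```shell
#!/bin/bash

# Toy model only: one 5s bucket of the lower archive held in a variable.
bucket_value=""

# A direct write into the lower archive replaces whatever was there.
write_direct() {
    bucket_value=$1
}

# Five points written straight into the same bucket: last write wins.
for value in 1 1 1 1 1; do
    write_direct "$value"
done
echo "direct writes:  $bucket_value"

# Propagation from the 1s archive: the five points are read back and
# summed (assuming aggregationMethod=sum) before being stored once.
propagated=0
for value in 1 1 1 1 1; do
    propagated=$(( propagated + value ))
done
echo "propagated sum: $propagated"
```

The direct writes leave 1 in the bucket, while propagation leaves 5.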


What you'll need to do is pre-aggregate your historical data for
back-loading. Generally you'll want to get live data flowing to carbon
first and worry about back-loading later. That way, if you wait a week
for live data to load up, you only need to worry about aggregating for
your lowest precision archive (the 1d:730).
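
A minimal sketch of that pre-aggregation step, under some assumptions:
the historical data is a text file with one "timestamp value" pair per
line, the aggregation is a sum (as in the script above), and daily
buckets start on multiples of 86400 seconds. preaggregate and
history.txt are hypothetical names for this example:

```shell
#!/bin/bash

# preaggregate STEP
# Read "timestamp value" lines on stdin and print one "bucket_start:sum"
# pair per STEP-second bucket, oldest first.
preaggregate() {
    awk -v step="$1" '
        { bucket = $1 - ($1 % step); sum[bucket] += $2 }
        END { for (b in sum) printf "%d:%s\n", b, sum[b] }
    ' | sort -t: -k1,1n
}

# Three raw points: two on the same day, one on the following day.
printf '1334621055 1\n1334621056 1\n1334707200 3\n' | preaggregate 86400

# The pairs are in the timestamp:value form whisper-update.py takes, so
# a back-load could look like this (history.txt is hypothetical):
#   whisper-update.py test.wsp $(preaggregate 86400 < history.txt)
```

For the sample input this prints 1334620800:2 and 1334707200:3. If your
archive uses a different aggregation method (average, for example),
the awk body would need to change to match.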

Hope this helps

