Yes, it is (somewhat) intended, because you would possibly also want to take the 
"current" value and its window (i.e. the last w seconds) into account for the 
estimation as well.
Obviously this selection has more of an impact on the sigma calculation than on 
the prediction itself - especially when the current value is "outside" of the 
expected behavior...

That is why I have provided the long format for defining the offsets of your 
choice as absolute offsets (which also allows you to give some days - say 7 
days ago - a higher weight than others...)

The negative option was more of a shortcut for the 0, -1*s, -2*s, ..., -n*s 
case, which was my focus.

But as for your "intuitive" argument - I would at least want -7*s included 
when I say n=-7, not only up to -6*s, so this is an off-by-one bug...
But I would still include the "current" window as part of the sample - 
especially for the sigma part (and if you include it, then you need to include 
it the same way for both...)
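To make the off-by-one concrete, here is a small Python sketch (not rrdtool source, just a model of the two expansions discussed in this thread) of how a negative shift count could expand into offsets:

```python
def shifts_actual(step, n):
    # Behavior described in the thread for CDEF:y=s,-n,x,PREDICT:
    # the current window (offset 0) is included, and the expansion
    # stops at -(n-1)*step, so -n*step itself is never sampled.
    return [-k * step for k in range(n)]

def shifts_intuitive(step, n):
    # The expected reading of n=-7: one to n steps back, no offset 0.
    return [-k * step for k in range(1, n + 1)]

day = 86400
print(shifts_actual(day, 7))     # 7 offsets, but only 6 days back (plus "now")
print(shifts_intuitive(day, 7))  # 7 offsets, reaching back to -7*step
```

Both variants yield n offsets; they differ only in whether offset 0 or -n*step is the one included.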

You can still implement the behavior of your choice using the explicit 
approach: 
CDEF:y=-604800,-604800,-518400,-432000,-345600,-259200,-172800,-86400,8,1800,x,PREDICT

(here the same day a week ago gets a higher weight by being included twice, 
and the value of "now" has no influence)
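The duplicate-offset weighting can be sketched in Python (a hypothetical model of PREDICT-style sliding-window averaging, not the actual rrdtool implementation; the timestamped-series shape and window handling here are simplifying assumptions):

```python
def predict_average(series, now, offsets, window):
    """Average every sample inside the `window` seconds ending at now+offset,
    for each listed offset -- listing an offset twice doubles its weight."""
    samples = []
    for off in offsets:              # duplicates are deliberately kept
        end = now + off
        for t, v in series.items():  # series: timestamp -> value
            if end - window < t <= end:
                samples.append(v)
    return sum(samples) / len(samples) if samples else None

# Toy data: one sample a week ago, one a day ago; -604800 listed twice.
series = {1000 - 604800: 4.0, 1000 - 86400: 1.0}
print(predict_average(series, 1000, [-604800, -604800, -86400], 1800))  # 3.0
```

With the week-ago offset listed once instead, the same call would return 2.5 - the duplication is what shifts the weight.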

Ciao, Martin

P.S.: For our practical implementations we have switched to the explicit 
approach (as it allows you to give a higher weight to the same day a week ago...)
So our use case is not really affected by any change to this code...

As I think about it - under some circumstances a PREDICT_MEDIAN would also be a 
good thing to use instead of calculating the average with PREDICT.

Obviously MEDIAN and SIGMA do not mix well, but that way a "one-off" (say an 
issue on one day last week) has less influence on the prediction than in the 
PREDICT_AVERAGE version.
On the other hand it favours older data on any increasing/decreasing trend, so 
it is more likely to over/underestimate - especially if you do predictions on 
data whose values increase/decrease by a factor of 2 in 6 months, with a 
prediction over 6 steps of say 30 days each and a window of 7 days.
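As a hedged illustration of both effects (made-up numbers, plain Python rather than rrdtool): the median shrugs off a one-off spike, but on growing data it sits further below the latest value than the mean does:

```python
import statistics

# One bad day last week: the mean is dragged up, the median is not.
week = [100, 102, 98, 101, 99, 100, 500]
print(round(statistics.mean(week), 1))   # 157.1 -- pulled up by the one-off
print(statistics.median(week))           # 100   -- unaffected

# Values doubling over 6 steps: the median lags the trend more than the mean,
# and both sit below the most recent value.
growth = [100 * 2 ** (i / 6) for i in range(7)]
assert statistics.median(growth) < statistics.mean(growth) < growth[-1]
```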

To get upper/lower boundary graphs and alerting based on those, the 
equivalent of PREDICT+X*SIGMA, PREDICT, PREDICT-X*SIGMA would be the 
calculation of "percentiles" - so 
PREDICT_PERCENTILE(75),PREDICT_MEDIAN,PREDICT_PERCENTILE(25).
(Note that PREDICT_MEDIAN = PREDICT_PERCENTILE(50), which would make the code 
easier, but maybe with PREDICT_MEDIAN kept as an alias/shorthand...)

Note that the sorting of values required for MEDIAN/PERCENTILE is even more 
CPU-intensive than the calculation of averages...
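A percentile band of that shape could be sketched like this in Python (PREDICT_PERCENTILE and PREDICT_MEDIAN are only proposed names in this thread, not existing rrdtool functions). Note that statistics.quantiles sorts the samples internally - the O(n log n) cost mentioned above, versus the O(n) running sums an average needs:

```python
import statistics

# Window samples gathered for one predicted point (made-up values,
# including one outlier that a sigma-based band would be skewed by).
samples = [98, 99, 100, 100, 101, 102, 500]

# Quartiles: 25th, 50th (median), 75th percentile.
q25, median, q75 = statistics.quantiles(samples, n=4)
lower, center, upper = q25, median, q75
print(lower, center, upper)   # 99.0 100.0 102.0
```

The outlier moves the upper bound barely at all here, whereas it would inflate a SIGMA-based band considerably.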
On 26.04.2014, at 11:02, Steve Shipway <[email protected]> wrote:

> So, assuming stptr is zero based, this means my analysis of the behaviour of 
> negative shift counts is correct.
> 
> My question, though, is more to ask if this was intended by design, or if it 
> is a 'feature'?
> 
> Intuitively, I would have expected this:
> 
> CDEF:y=s,-n,x,PREDICT
> 
> to result in shifts of s, 2s, 3s, .... ns
> 
> However, it actually results in shifts of 0, s, 2s, ... (n-1)s
> 
> Either way, it can't be changed now as it would potentially alter existing 
> behaviour... 
> 
> Steve
> 
> Steve Shipway
> University of Auckland ITS
> UNIX Systems Design Lead
> [email protected]
> Ph: +64 9 373 7599 ext 86487
> 
> 
> _______________________________________________
> rrd-developers mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers

