On 3/23/12 3:02 AM, Juan Manuel Cabo wrote:
On Friday, 23 March 2012 at 05:16:20 UTC, Andrei Alexandrescu wrote:
[.....]
(man, the Gaussian curve is everywhere, it never ceases to
perplex me).
I'm actually surprised. I'm working on benchmarking lately and the
distributions I get are very concentrated around the minimum.
Andrei
Well, the shape of the curve depends a lot on
how the random noise gets inside the measurement.
[snip]
Hmm, well the way I see it, the observed measurements have the following
composition:
X = T + Q + N
where T > 0 (a constant) is the "real" time taken by the processing, Q >
0 is the quantization noise caused by the limited resolution of the
clock (can be considered 0 if the resolution is much smaller than the
actual time), and N is noise caused by a variety of factors (other
processes, throttling, interrupts, networking, memory hierarchy effects,
and many more). The challenge is estimating T given a bunch of X samples.
N can probably be approximated by a Gaussian, although for short timings
I've noticed it's more like bursts that just cause outliers. But note
that N is always positive (and therefore not truly Gaussian), i.e.
there's no way to insert noise that makes the code seem artificially
faster. It's all additive.
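The additive model can be sketched in a few lines of Python (a hypothetical simulation, not real measurements; the values of T, the clock resolution, and the noise distribution are made up for illustration):

```python
import math
import random

def measure(T=100.0, clock_res=1.0, noise_mean=5.0):
    """Simulate one observation X = T + Q + N.

    T is the "real" time, N is strictly positive additive noise
    (modeled here as exponential, to mimic bursty outliers), and Q
    is quantization noise from a clock that rounds up to its
    resolution, so Q >= 0 as well.
    """
    N = random.expovariate(1.0 / noise_mean)
    raw = T + N
    return math.ceil(raw / clock_res) * clock_res

samples = [measure() for _ in range(1000)]
# Because both Q and N are nonnegative, no sample ever dips below T.
assert all(x >= 100.0 for x in samples)
```

The key property the simulation exhibits is the one argued in the text: the noise can only inflate a measurement, never deflate it.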
Taking the mode of the distribution will estimate T + mode(N), which is
informative because after all there's no way to eliminate noise.
However, if the focus is improving T, we want an estimate as close to T
as possible. In the limit, taking the minimum over infinitely many
measurements of X would yield T.
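That limit argument can be checked numerically (again a hypothetical simulation: T and the noise distribution are invented, with Q ignored for simplicity):

```python
import random

T = 100.0  # hypothetical "real" time of the code under test

# Many observations X = T + N with strictly positive noise N.
xs = [T + random.expovariate(1.0 / 5.0) for _ in range(100_000)]

est_few = min(xs[:10])   # minimum over a handful of runs
est_all = min(xs)        # minimum over all runs

# Since the minimum is taken over a superset, more measurements can
# only tighten the estimate downward, and it is bounded below by T.
assert T <= est_all <= est_few
```

With positive additive noise, min(X) is a consistent, downward-converging estimator of T, which is why taking the minimum of many benchmark runs estimates the noise-free time.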
Andrei