On 4/8/12 3:03 PM, Manfred Nowak wrote:
> Andrei Alexandrescu wrote:
>> Clearly there is noise during normal use as well, but incorporating
>> it in benchmarks as a matter of course reduces the usefulness of
>> benchmarks.
> On the contrary:
> 1) The "noise during normal use" has to be measured in order to detect
> the sensitivity of the benchmarked program to that noise.
That sounds quite tenuous to me. How do you measure it, and what
conclusions can you draw beyond the fact that there's more or less other
stuff going on on the machine, and that the machine itself has complex
interactions?
As far as I can tell, a time measurement result is:

T = A + Q + N

where:

A > 0 is the actual benchmark time;
Q > 0 is quantization noise (uniformly distributed);
N > 0 is various other noise (interrupts, task switching, networking, the
CPU dynamically changing frequency, etc.). Many people jump on a Gaussian
as an approximation for N, but my tests suggest it's hardly so, because N
has a lot of jerky outliers.
How do we estimate A given T?
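One practical answer, given that Q and N only ever add to a sample, is to repeat the measurement many times and take the minimum observed T as the estimate of A: noise can inflate a run but never deflate it below A. A minimal sketch (the workload and repetition count here are made up for illustration):

```python
import time

def estimate_a(fn, runs=100):
    """Estimate the actual benchmark time A from noisy samples T.

    With T = A + Q + N and Q, N non-negative, noise can only push a
    sample above A, so the minimum over many runs is the sample
    closest to A -- and robust against the jerky outliers in N that
    break Gaussian-style averaging.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

# Hypothetical workload, just to have something to time.
def workload():
    sum(i * i for i in range(10_000))

print(estimate_a(workload))
```

Averaging, by contrast, folds every outlier into the result, which is why the minimum (or a low percentile) tends to be the more stable estimator here.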
> 2) The noise the benchmarked program produces has to be measured too,
> because the running benchmarked program probably increases the noise
> for all other running programs.
How would one measure that? Also, that noise does not need to be measured
so much as eliminated to the extent possible, because the benchmark app's
noise is a poor model of the application-induced noise.
> In addition: the noise produced by a machine under heavy load might
> bring the performance of the benchmarked program down to zero.
Of course. That's why the documentation emphasizes the necessity of
baselines. A measurement without baselines is irrelevant.
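As an illustration of what a baseline buys you, here is a sketch in which a baseline run goes through the same harness as the real measurement but omits the operation under test, so the shared overhead can be subtracted out (the helper and workload names are hypothetical):

```python
import time

def min_time(fn, runs=100):
    # Noise only ever adds to a sample, so the minimum over many
    # runs is the least-contaminated measurement.
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

def baseline():
    # Same harness, loop, and call overhead as the real run --
    # everything except the operation under test.
    pass

def benchmarked():
    sorted(range(1000, 0, -1))

overhead = min_time(baseline)
total = min_time(benchmarked)
print("net time:", total - overhead)
```

Reporting the net (or the ratio against the baseline) makes results comparable across machines with different fixed overheads, which is what makes a raw, baseline-free number hard to interpret.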
Andrei