On 4/8/12 3:03 PM, Manfred Nowak wrote:
> Andrei Alexandrescu wrote:

>> Clearly there is noise during normal use as well, but
>> incorporating it in benchmarks as a matter of course reduces the
>> usefulness of benchmarks.

> On the contrary:
> 1) The "noise during normal use" has to be measured in order to detect
> the sensitivity of the benchmarked program to that noise.

That sounds quite tenuous to me. How do you measure it, and what conclusions do you draw other than that there's more or less other stuff going on on the machine, and that the machine itself has complex interactions?

As far as I can tell, a time measurement result is:

T = A + Q + N

where:

A > 0 is the actual benchmark time

Q > 0 is quantization noise (uniform distribution)

N > 0 is various other noise (interrupts, task switching, networking, the CPU dynamically changing frequency, etc.). Many people jump on a Gaussian as an approximation, but my tests suggest it's hardly so, because there are a lot of jerky outliers.

How do we estimate A given T?
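One practical answer, sketched below in C++ (the workload, clock, and repetition count are illustrative assumptions on my part, not a prescription): since Q and N are both nonnegative and additive, repeat the measurement and keep the minimum. The minimum converges toward A, whereas the mean stays biased upward by exactly those jerky outliers.

// Sketch: estimate A as the minimum of many repeated measurements.
// Q >= 0 and N >= 0 only ever add to T, so min(T_i) approaches A as
// repetitions grow. The workload below is a made-up stand-in for the
// code under benchmark.
#include <chrono>
#include <cstdio>

using Clock = std::chrono::steady_clock;

static void workload()                   // hypothetical code under test
{
    volatile unsigned long sum = 0;
    for (unsigned long i = 0; i < 1'000'000; ++i)
        sum += i;
}

int main()
{
    const int repetitions = 1000;
    auto best = std::chrono::nanoseconds::max();

    for (int i = 0; i < repetitions; ++i)
    {
        auto start = Clock::now();
        workload();
        auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            Clock::now() - start);
        if (elapsed < best)
            best = elapsed;              // running minimum ~ estimate of A
    }

    std::printf("estimated A: %lld ns\n",
                static_cast<long long>(best.count()));
}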

> 2) The noise the benchmarked program produces has to be measured too,
> because the running benchmarked program probably increases the noise
> for all other running programs.

How to measure that? Also, that noise does not need to be measured as much as eliminated to the extent possible. This is because the benchmark app noise is a poor model of the application-induced noise.
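To make "eliminated to the extent possible" concrete, here is a Linux-only sketch (assuming glibc; nothing about it is specific to any benchmarking library) that pins the process to one core before measuring, so migrations and part of the scheduling noise never enter N in the first place.

// Sketch: reduce N rather than model it, by pinning the process to a
// single CPU (Linux/glibc only; core 0 is an arbitrary choice).
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                      // needed for sched_setaffinity
#endif
#include <sched.h>
#include <cstdio>

int main()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        std::perror("sched_setaffinity");

    // ... run the measurement loop here, on a single core ...
}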

> In addition: the noise produced by a machine under heavy load might
> bring the performance of the benchmarked program down to zero.

Of course. That's why the documentation emphasizes the necessity of baselines. A measurement without baselines is irrelevant.
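For concreteness, a minimal sketch of what baseline subtraction can look like, reusing the min-of-N estimator from the earlier sketch; the names and workloads are hypothetical, and this is not a description of any particular library's API. The figure worth reporting is candidate minus baseline, and a candidate time that isn't comfortably above the baseline tells you the measurement means nothing.

// Sketch: measure an empty baseline with the same harness as the
// candidate, then report the difference. All names are hypothetical.
#include <chrono>
#include <cstdio>

using Clock = std::chrono::steady_clock;

template <typename F>
long long measureNs(F f, int repetitions = 1000)
{
    auto best = std::chrono::nanoseconds::max();
    for (int i = 0; i < repetitions; ++i)
    {
        auto start = Clock::now();
        f();
        auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            Clock::now() - start);
        if (elapsed < best)
            best = elapsed;
    }
    return best.count();
}

static void baseline() {}                // harness overhead only

static void candidate()                  // hypothetical code under test
{
    volatile unsigned long sum = 0;
    for (unsigned long i = 0; i < 100'000; ++i)
        sum += i;
}

int main()
{
    long long b = measureNs(baseline);
    long long c = measureNs(candidate);
    std::printf("baseline:  %lld ns\n", b);
    std::printf("candidate: %lld ns\n", c);
    std::printf("net:       %lld ns\n", c - b);   // the number to report
}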


Andrei
