On 9/21/12 5:39 AM, Jacob Carlborg wrote:
On 2012-09-21 06:23, Andrei Alexandrescu wrote:

For a very simple reason: unless the algorithm under benchmark is very
long-running, max is completely useless, and it ruins average as well.

I may have completely misunderstood this, but aren't we talking about
what to include in the output of the benchmark? In that case, if you
don't like max and average, just don't look at them.

I disagree. I won't include something in my design just so that people can ignore it most of the time. Max and average are, most of the time, awful things to include, and will throw off people with bizarre results.

If it's there, it's worth looking at. Note how all columns are directly comparable (I might add, unlike other approaches to benchmarking).

For virtually all benchmarks I've run, the distribution of timings is a
half-Gaussian very concentrated around the minimum. Say you have a
minimum of e.g. 73 us. Then there would be a lot of results close to
that; the mode of the distribution would be very close, e.g. 75 us, and
the more measurements you take, the closer the mode is to the minimum.
Then you have a few timings up to e.g. 90 us. And finally you will
inevitably have a few outliers in the millisecond range. Those are orders of
magnitude larger than anything of interest and are caused by system
interrupts that happened to fall in the middle of the measurement.

Taking those outliers into consideration and folding them into the
average simply brings useless noise into the measurement process.
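
To make the point concrete, here is a minimal, hypothetical sketch of such a measurement loop in plain D. It does not use the proposed std.benchmark API; it relies on std.datetime.stopwatch, and the workload function is just a stand-in for the code under benchmark. The minimum of many trials stays stable, while an average would be dragged upward by the millisecond-scale outliers.

import std.algorithm : sort;
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.stdio : writefln;

// Stand-in workload; in practice this would be the code under benchmark.
int workload()
{
    int sum = 0;
    foreach (i; 0 .. 100_000)
        sum += i;
    return sum;
}

void main()
{
    enum trials = 1000;
    long[] timings;
    foreach (t; 0 .. trials)
    {
        auto sw = StopWatch(AutoStart.yes);
        workload();
        sw.stop();
        timings ~= sw.peek.total!"usecs";
    }
    sort(timings);
    // The minimum is the least contaminated estimate; the median sits just
    // above it, and the tail holds the interrupt-induced outliers that
    // would pollute an average.
    writefln("min: %s us  median: %s us  max: %s us",
             timings[0], timings[$ / 2], timings[$ - 1]);
}

Running something like this typically shows the min and median within a few percent of each other, with the max orders of magnitude away.
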

After your reply to one of Manu's posts, I think I misunderstood the
std.benchmark module. I was thinking more of profiling. But aren't these
quite similar tasks? Couldn't std.benchmark work for both?

This is an interesting idea. It would delay the release quite a bit, because I'd need to design and implement things like performance counters and such.


Andrei
