On 21-Sep-12 22:49, David Piepgrass wrote:
After extensive tests with a variety of aggregate functions, I can say
firmly that taking the minimum time is by far the best when it comes
to assessing the speed of a function.
As far as I know, D doesn't offer a sampling profiler, so one might
indeed use a benchmarking library as a (poor) substitute. So I'd want to
be able to set up some benchmarks that operate on realistic data, with
perhaps different data in different runs in order to learn about how the
speed varies with different inputs (if it varies a lot, then I might
create more benchmarks to investigate which inputs are processed
quickly and which slowly).
Really good profilers are the ones provided by the CPU vendors. See
AMD's CodeAnalyst or Intel's VTune. They can even count branch
mispredictions, cache misses, etc.
It is certainly outside the charter of this module, or for that matter
of any standard-library code.
Some random comments about std.benchmark based on its documentation:
- It is very strange that the documentation of printBenchmarks uses
neither of the words "average" nor "minimum", and doesn't say how many
trials are done... I suppose the obvious interpretation is that it only
does one trial, but then we wouldn't be having this discussion about
averages and minimums, right?
See the algorithm in action here:
https://github.com/D-Programming-Language/phobos/pull/794/files#L2R381
In other words, a function is run 10^n times, with n picked so that the
total time is big enough to be a trustworthy measurement. The reported
run time is then the total time divided by 10^n.
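A rough sketch of that idea, not the pull-request code itself (the
helper name, the delegate parameter and the 20 ms threshold below are
arbitrary choices for illustration):

import core.time : Duration, msecs;
import std.datetime.stopwatch : AutoStart, StopWatch; // plain std.datetime in older releases

// Grow the repetition count by 10x until the total run time is long
// enough to trust, then report the time per call.
Duration timePerCall(void delegate() fun, Duration minTotal = 20.msecs)
{
    for (long n = 1; ; n *= 10)
    {
        auto sw = StopWatch(AutoStart.yes);
        foreach (i; 0 .. n)
            fun();
        sw.stop();
        if (sw.peek() >= minTotal)
            return sw.peek() / n;   // time per single call
    }
}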
Øivind says tests are run 1000 times...
The above is repeated 1000 times, picking the minimum as the best
result. Obviously it would be good for this to be configurable.
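And the outer layer, again only as a sketch (timePerCall is the helper
from the previous snippet; the 1000 default is hard-coded purely for
illustration):

import core.time : Duration;
import std.algorithm : min;

// Repeat the whole measurement and keep the fastest trial; the fastest
// run is the one least disturbed by the OS, other processes, cold
// caches, etc.
Duration bestOf(void delegate() fun, size_t trials = 1000)
{
    auto best = Duration.max;
    foreach (t; 0 .. trials)
        best = min(best, timePerCall(fun));
    return best;
}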
but
it needs to be configurable per-test (my idea: support a _x1000 suffix
in function names, or _for1000ms to run the test for at least 1000
milliseconds; and allow a multiplier when running a group of
benchmarks, e.g. a multiplier argument of 0.5 means to only run half as
many trials as usual.) Also, it is not clear from the documentation what
the single parameter to each benchmark is (define "iterations count".)
- The "benchmark_relative_" feature looks quite useful. I'm also happy
to see benchmarkSuspend() and benchmarkResume(), though
benchmarkSuspend() seems redundant in most cases: I'd like to just call
one function, say, benchmarkStart() to indicate "setup complete, please
start measuring time now."
- I'm glad that StopWatch can auto-start; but the documentation should
be clearer: does reset() stop the timer or just reset the time to zero?
does stop() followed by start() start from zero or does it keep the time
on the clock? I also think there should be a method that returns the
value of peek() and restarts the timer at the same time (perhaps stop()
and reset() should just return peek()?)
It's the same as the usual stopwatch (as in the real hardware thingy). Thus:
- reset just resets the numbers to zero (it doesn't stop the counting)
- stop just stops counting (the time accumulated so far is kept)
- start just starts counting again from the accumulated time
- peek imitates taking a look at the numbers on the device ;)
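In code the semantics look like this (just an illustration; StopWatch
is in std.datetime.stopwatch in recent releases, plain std.datetime in
older ones):

import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

void main()
{
    auto sw = StopWatch(AutoStart.yes); // auto-start: begins counting right away
    // ... code being timed ...
    sw.stop();          // pauses counting; the time accumulated so far is kept
    writeln(sw.peek()); // peek only reads the accumulated time
    sw.start();         // resumes counting from the accumulated time, not from zero
    sw.reset();         // sets the accumulated time back to zero
}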
- After reading the documentation of comparingBenchmark and measureTime,
I have almost no idea what they do.
I think that comparingBenchmark was present in std.datetime and is
carried over as is.
--
Dmitry Olshansky