On 10/3/11 8:59 AM, Jens Mueller wrote:
[snip]

Thanks for this rich feedback. I'll act on most or all points. A few notes follow.

1. I'm not that convinced that prepending benchmark_ to a function is
that nice, even though it's simple. But following this idea I could
remove all my unittest blocks, add functions starting with test_, and
write a module std.test in the same fashion. If this approach is
considered the way to go, we can remove unittest altogether and solve
testing entirely in a library. It looks less elegant to me, but I admit
I see no other way and it is the easiest solution after all. I just
want to stress that when seeing the code I asked myself whether having
unittest in the language is all that nice to have.

std.benchmark relies on reflection features that have become available only recently. If unittest were proposed today it would be difficult to justify, but at the time it was a good feature.
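
For the record, here is a minimal sketch of the kind of discovery involved. This is not std.benchmark's actual implementation, just an illustration of the compile-time reflection it leans on; the benchmark_ functions and runAllBenchmarks below are made up for the example:

module discover;

import std.algorithm : startsWith;
import std.stdio : writeln;

void benchmark_append()  { /* ... */ }
void benchmark_reverse() { /* ... */ }

void runAllBenchmarks()
{
    // Walk this module's members at compile time and call every function
    // whose name carries the benchmark_ prefix.
    foreach (name; __traits(allMembers, discover))
    {
        static if (name.startsWith("benchmark_"))
        {
            writeln("running ", name);
            __traits(getMember, discover, name)();
        }
    }
}

void main()
{
    runAllBenchmarks();
}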

2. When benchmarking code you often want to investigate how the
algorithm scales.

Yah, that definitely needs to be looked at. I think I need to add some support beyond mere documentation.
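
For now this kind of check can be done by hand, e.g. along the following lines (a sketch only; StopWatch is imported from std.datetime.stopwatch here, and the appender loop is just a stand-in workload):

import std.array : appender;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

void main()
{
    // Time the same operation at several input sizes; for a linear
    // algorithm the per-element cost should stay roughly flat.
    foreach (n; [1_000, 10_000, 100_000, 1_000_000])
    {
        auto sw = StopWatch(AutoStart.yes);
        auto app = appender!(int[])();
        foreach (i; 0 .. n)
            app.put(i);
        sw.stop();
        writefln("n = %9s: %8.2f ns/element",
            n, sw.peek.total!"nsecs" / cast(double) n);
    }
}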

3.
(r[i] / 10_000_000.).to!("nsecs", int)
should be written as
(r[i] / 10_000_000).nsecs

Same for msecs and similar.

Noted.

4. Test results should be exposed to the caller.
Usually it is enough to write the results to the console, but you may
want to post-process them, be it graphically or as XML output for
whatever reason.
I think it would be beneficial to add this flexibility and separate the
benchmarking logic from the generation of the output.
It might also be useful to add the CPU in use (as reported by std.cpuid) to the output.

We should decouple collection from formatting. However, this complicates matters because we'd need to expose the benchmark results structure. I'll think about it.
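
Roughly what I have in mind, as a sketch. None of the names below exist in std.benchmark; they are placeholders for the structure we'd need to expose and for a formatter the caller could swap out:

import core.time : Duration;
import std.stdio : File, stdout;

// Hypothetical record exposed to the caller.
struct BenchmarkResult
{
    string name;       // e.g. "fileWrite"
    ulong calls;       // number of calls measured
    Duration perCall;  // best observed time per call
}

// Collection only gathers data...
BenchmarkResult[] collectBenchmarks()
{
    BenchmarkResult[] results;
    // ... run the benchmark_ functions and fill results ...
    return results;
}

// ...and presentation becomes a separate concern the caller can replace
// (console table, XML, plotting, adding std.cpuid output, and so on).
void writeConsoleReport(const(BenchmarkResult)[] results, File sink = stdout)
{
    foreach (r; results)
        sink.writefln("%-20s %12s calls  %s/call", r.name, r.calls, r.perCall);
}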

5. In the first example I completely missed that "relative calls" is
not one term. I read it as a single term even though I didn't know what
it was supposed to mean. Only later did I figure out that there is an
empty column called relative.

There are two spaces between the column headers there; I should add three.

6. Wrong numbers in documentation
The examples in the documentation have wrong numbers. In the end one
should pay special attention to keeping the documentation consistent. I
think this cannot be automated.

Not sure how this can be addressed at all (or maybe I don't understand). The examples show real numbers collected from real runs on my laptop. I don't think one may expect to reproduce them with reasonable accuracy because they depend on a lot of imponderables.

There are also rounding errors. When I try to compute these numbers
myself I get differences in the first digit after the decimal point. I
wonder why that is. Even though the measurement may be noisy, the
numbers derived from it should be more precise, I think.

Where are the calculations mistaken? Indeed I see e.g. that the built-in append has 83ns/call and 11.96M calls/s. Inverting the first with a "calculator" program yields 12.04 Mcalls/s, and inverting the second yields 83.61ns/call. I'll see what can be done.
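
For reference, a quick round trip, assuming the ns/call column is rounded to whole nanoseconds (which alone would account for a discrepancy of this size):

import std.stdio : writefln;

void main()
{
    // Inverting the rounded 83 ns/call vs. inverting the reported rate.
    writefln("1e9 / 83 ns     = %.2f M calls/s", 1e9 / 83.0 / 1e6);
    writefln("1e9 / 11.96 M/s = %.2f ns/call", 1e9 / 11.96e6);
    // If the unrounded figure is about 83.6 ns/call, printing it as 83 and
    // then inverting the printed value produces exactly this kind of
    // mismatch in the first digit after the decimal point.
}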

7. Changing units in the output.
One benchmark report may use different units in its output. The one in
the example uses microseconds and nanoseconds. This makes comparison
more difficult and should be avoided (at least for groups). It is
error-prone when analyzing the numbers.

Problem is space and wildly varying numbers. I initially had exponential notation for everything, but it was super ugly and actually more difficult to navigate. That's why I added the compact notation. I think we'll need to put up with it.

8. I prefer writing std.benchmark.pause() over benchmarkPause() and
std.benchmark.resume() over benchmarkResume().

OK.

10. The file reading/writing benchmarks should be improved, since the
created file should be deleted afterwards. The file writing benchmark should be
more like:
void benchmark_fileWrite()
{
     benchmarkPause();
     auto f = File.tmpfile();
     benchmarkResume();

     f.writeln("hello, world!");
}

But this also measures the deletion of the file.

If you don't want to measure std.file.write but only the time spent in writeln, this should work and measure only the time spent writing to the file:

void benchmark_fileWrite()
{
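     // keep the creation of the temporary file out of the measurement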
     benchmarkPause();
     auto f = File.tmpfile();
     benchmarkResume();
     f.writeln("hello, world!");
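     // leave the timer paused so that closing/removing the temporary
     // file at scope exit isn't measured either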
     benchmarkPause();
}

I haven't yet tested the case when a function leaves the benchmark paused.

I have no idea how to fix the benchmark for reading a file. But I
believe it involves using mkstemp on POSIX and similar on Windows.
mkstemp should be added to std.file in the long run.

Why is there a need to fix it? The benchmark measures the performance of std.file.read, soup to nuts.

In related news, I got word from a colleague that I should also use times() (http://linux.die.net/man/2/times) to distinguish time spent in user space vs. system calls. This is because some functions must issue certain system calls in any case, and the user is interested in how much they can improve the situation by optimizing their end of the deal. This would apply e.g. to file I/O, where the overhead of formatting etc. would become easier to distinguish.
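
Something along these lines, as a sketch (POSIX only; the binding is declared inline rather than assuming a particular druntime module, and reportUserSystem is a made-up name):

version (Posix)
{
    import core.sys.posix.sys.types : clock_t;
    import std.stdio : writefln;

    // Inline declaration of the POSIX times() interface.
    extern (C) nothrow @nogc
    {
        struct tms
        {
            clock_t tms_utime;   // user CPU time
            clock_t tms_stime;   // system CPU time
            clock_t tms_cutime;  // user CPU time of terminated children
            clock_t tms_cstime;  // system CPU time of terminated children
        }
        clock_t times(tms*);
    }

    // Reports user vs. system CPU time spent in `work`, in clock ticks
    // (there are sysconf(_SC_CLK_TCK) of them per second).
    void reportUserSystem(scope void delegate() work)
    {
        tms before, after;
        times(&before);
        work();
        times(&after);
        writefln("user: %s ticks, system: %s ticks",
            after.tms_utime - before.tms_utime,
            after.tms_stime - before.tms_stime);
    }
}

These figures would complement, not replace, the wall-clock numbers; whether they belong in std.benchmark proper or in the caller's post-processing ties back to the collection vs. formatting discussion above.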


Andrei
