On 3/19/2011 10:16 PM, dsimcha wrote:
* Again: speed of e.g. parallel min/max vs. serial, pi computation etc.
on a usual machine?

I **STRONGLY** believe this does not belong in API documentation because
it's too machine specific, compiler specific, stack alignment specific,
etc. and almost any benchmark worth doing takes up more space than an
example should. Furthermore, anyone who wants to know this can easily
time it themselves. I have absolutely no intention of including this.
While in general I appreciate and have tried to accommodate your
suggestions, this is one I'll be standing firm on.

If scalability information is present, in however non-committal a form,
then people would be compelled ("ok, so this shape of the loop would
actually scale linearly with CPUs... neat").

Ok, I thought you were asking for something much more rigorous than
this. I therefore didn't want to provide it because I figured that, no
matter what I did, someone would be able to say that the benchmark is
flawed somehow, yada, yada, yada. Given how inexact a science
benchmarking is, I'm still hesitant to put such results in API docs, but
I can see where you're coming from here.

I tried putting in a few benchmarks for the performance of parallel
foreach, map and reduce. Basically, I just put in the timings for the
examples on my dual core machine, and for an equivalent serial version
that isn't shown in the interest of brevity but is easily inferred. I
didn't do this for the pipelining or future/promise primitives because,
unlike map, reduce and foreach, they can't automatically take full
advantage of as many cores as you throw at them. This means that coming
up with a benchmark that shows where a good speedup can be obtained is a
much more difficult art.
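
For illustration only, here is a rough sketch of what such a serial vs.
parallel timing could look like for the pi computation mentioned earlier
in the thread. It is not the benchmark that went into the docs. The
problem size n, the midpoint-rule integrand getTerm, and the use of
std.datetime.stopwatch (a module from current Phobos, not the 2011
library) are all assumptions made for the example, and the numbers it
prints will vary with machine, compiler and flags.

```d
import std.algorithm : map, reduce;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    // Assumed problem size; pick something large enough to dominate overhead.
    enum n = 100_000_000;
    immutable delta = 1.0 / n;

    // One midpoint-rule term of the integral of 1 / (1 + x^2) over [0, 1].
    // Four times the sum of all terms approximates pi.
    real getTerm(int i)
    {
        immutable x = (i - 0.5) * delta;
        return delta / (1.0 + x * x);
    }

    // Serial version: std.algorithm.reduce with an explicit seed.
    auto sw = StopWatch(AutoStart.yes);
    immutable serialPi = 4.0 * reduce!"a + b"(0.0L, map!getTerm(iota(n)));
    immutable serialTime = sw.peek;

    // Parallel version: the same range, reduced on the default task pool.
    sw.reset();
    immutable parallelPi = 4.0 * taskPool.reduce!"a + b"(map!getTerm(iota(n)));
    immutable parallelTime = sw.peek;

    writefln("serial:   %s ms, pi ~ %.10f",
             serialTime.total!"msecs", serialPi);
    writefln("parallel: %s ms, pi ~ %.10f",
             parallelTime.total!"msecs", parallelPi);
}
```

A single run like this only gives a ballpark figure, which is about all
a non-committal scalability note in the docs would need.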
