On Sunday, 21 February 2016 at 14:32:15 UTC, dextorious wrote:
I've been vaguely aware of D for many years, but the recent
addition of std.experimental.ndslice finally inspired me to
give it a try, since my main expertise lies in the domain of
scientific computing and I primarily use Python/Julia/C++,
where multidimensional arrays can be handled with a great deal
of expressiveness and flexibility. Before writing anything
serious, I wanted to get a sense for the kind of code I would
have to write to get the best performance for numerical
calculations, so I wrote a trivial summation benchmark. The
following code gave me slightly surprising results:
import std.stdio;
import std.array : array;
import std.algorithm;
import std.datetime;
import std.range;
import std.experimental.ndslice;
void main() {
int N = 1000;
int Q = 20;
int times = 1_000;
double[] res1 = uninitializedArray!(double[])(N);
double[] res2 = uninitializedArray!(double[])(N);
double[] res3 = uninitializedArray!(double[])(N);
auto f = iota(0.0, 1.0, 1.0 / Q / N).sliced(N, Q);
StopWatch sw;
double t0, t1, t2;
sw.start();
foreach (unused; 0..times) {
for (int i=0; i<N; ++i) {
res1[i] = sumtest1(f[i]);
}
}
sw.stop();
t0 = sw.peek().msecs;
sw.reset();
sw.start();
foreach (unused; 0..times) {
for (int i=0; i<N; ++i) {
res2[i] = sumtest2(f[i]);
}
}
sw.stop();
t1 = sw.peek().msecs;
sw.reset();
sw.start();
foreach (unused; 0..times) {
sumtest3(f, res3, N, Q);
}
t2 = sw.peek().msecs;
writeln(t0, " ms");
writeln(t1, " ms");
writeln(t2, " ms");
assert( res1 == res2 );
assert( res2 == res3 );
}
auto sumtest1(Range)(Range range) @safe pure nothrow @nogc {
return range.sum;
}
auto sumtest2(Range)(Range f) @safe pure nothrow @nogc {
double retval = 0.0;
foreach (double f_ ; f) {
retval += f_;
}
return retval;
}
auto sumtest3(Range)(Range f, double[] retval, double N, double
Q) @safe pure nothrow @nogc {
for (int i=0; i<N; ++i) {
for (int j=1; j<Q; ++j) {
retval[i] += f[i,j];
}
}
}
When I compiled it using dmd -release -inline -O -noboundscheck
../src/main.d, I got the following timings:
1268 ms
312 ms
271 ms
I had heard while reading up on the language that in D explicit
loops are generally frowned upon and not necessary for the
usual performance reasons. Nevertheless, the two explicit loop
functions gave me an improvement by a factor of 4+.
Furthermore, the difference between sumtest2 and sumtest3 seems
to indicate that function calls have a significant overhead. I
also tried using f.reduce!((a, b) => a + b) instead of f.sum in
sumtest1, but that yielded even worse performance. I did not
try the GDC/LDC compilers yet, since they don't seem to be up
to date on the standard library and don't include the ndslice
package last I checked.
Now, seeing as how my experience writing D is literally a few
hours, is there anything I did blatantly wrong? Did I miss any
optimizations? Most importantly, can the elegant operator
chaining style be generally made as fast as the explicit loops
we've all been writing for decades?
The problem is not with ranges, but with the particualr algorithm
used for summing. If you look at the docs
(http://dlang.org/phobos-prerelease/std_algorithm_iteration.html#.sum) you'll see that if the range has random-access `sum` will use the pair-wise algorithm. About the second and third tests, the problem is with DMD which should not be used when measuring performance (but only for development, because it has fast compile-times).
These are the results that I get with LDC:
Pair-wise (sumtest1):
415 ms
21 ms
20 ms
And if I use the Kahan algorithm:
106 ms
36 ms
31 ms
The second two results are probably larger due to noise.
And if I increase N to 100_000:
Pair-wise (sumtest1):
29557 ms
2061 ms
1990 ms
Kahan:
4566 ms
2067 ms
1990 ms
According to `dub --verbose`, my command-line was roughly this:
ldc2 -ofapp -release -O5 -singleobj -w source/app.d
../../../../.dub/packages/mir-0.10.1-alpha/source/mir/ndslice/internal.d
../../../../.dub/packages/mir-0.10.1-alpha/source/mir/ndslice/iteration.d
../../../../.dub/packages/mir-0.10.1-alpha/source/mir/ndslice/package.d
../../../../.dub/packages/mir-0.10.1-alpha/source/mir/ndslice/selection.d
../../../../.dub/packages/mir-0.10.1-alpha/source/mir/ndslice/slice.d