On Saturday, 7 December 2013 at 09:46:11 UTC, Joseph Rushton Wakeling wrote:
On 07/12/13 09:14, Walter Bright wrote:
There are several D projects which show faster runs than C. If
your goal is to
pragmatically write faster D code than in C, you can do it
without too much
effort. If your goal is to find problem(s) with D, you can
certainly do that, too.
Well, as the author of a D library which outperforms the C
library that inspired it (at least within the limits of its
much smaller range of functionality; it's been a bit neglected
of late and needs more input) ...
... the practical experience I've had is that more than an
outright performance comparison, what it often comes down to is
effort vs. results, and the cleanliness/maintainability of the
resulting code. This is particularly true when it comes to C
code that is designed to be "safe", with all the resulting
boilerplate. It's typically possible to match or exceed the
performance of a C program with much more concise and easy to
follow D code.
Another factor that's important here is that C and D in general
seem to lead to different design solutions. Even if one has an
exact example in C to compare to, the natural thing to do in D
is often something different, and that leads to subtle and
not-so-subtle implementation differences that in turn affect
performance.
Example: in the C library that was my inspiration, there's a
function which requires the user to pass a buffer, to which it
writes a certain set of values which are calculated from the
underlying data. I didn't much like the idea of compelling the
user to pass a buffer, so when I wrote my D equivalent I used
stuff from std.range and std.algorithm to make the function
return a lazily-evaluated range that would offer the same
values as the C code stored in the buffer array.
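The two API shapes described above can be sketched in D roughly as follows. This is an illustrative sketch only: the squaring calculation stands in for whatever the real library computes from its underlying data, and the function names are hypothetical.

```d
import std.algorithm : map;
import std.array : array;

// Buffer-style API, as in the C library: the caller supplies storage
// and the function writes the computed values into it.
void computeInto(const double[] data, double[] buf)
{
    foreach (i, x; data)
        buf[i] = x * x;  // stand-in for the real calculation
}

// Range-style API: return a lazily evaluated range offering the same
// values, with no buffer required from the caller.
auto computeLazy(const double[] data)
{
    return data.map!(x => x * x);
}

void main()
{
    auto data = [1.0, 2.0, 3.0];

    auto buf = new double[](data.length);
    computeInto(data, buf);
    assert(buf == [1.0, 4.0, 9.0]);

    // Same values, but computed on demand as the range is consumed.
    assert(computeLazy(data).array == buf);
}
```

Note that the lazy version recomputes its values each time the range is traversed, which is exactly the trade-off discussed below.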
I assumed this might lead to a small overall performance hit
because the C program could just write once to a buffer and
re-use the buffer, whereas I might be lazily calculating and
re-calculating. Unfortunately it turned out that for whatever
reason, my lazily-calculated range was somehow responsible for
lots of micro-allocations, which slowed things down a lot. (I
tried it out again earlier this morning, just to refresh my
memory, and it looks like this may no longer be the case; so
perhaps something has been fixed here...)
So, that in turn led me to another solution again, where
instead of an external buffer being passed in, I created an
internal cache which could be written to once and re-used again
and again and again, never needing to recalculate unless the
internal data was changed.
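The cached design described here might look roughly like this in D. Again a hedged sketch: the struct, its names, and the stand-in calculation are illustrative, not the actual library's code.

```d
import std.algorithm : map;
import std.array : array;

// Compute once into an internal cache, reuse on every later call,
// and recompute only when the underlying data changes.
struct Dataset
{
    private double[] data;
    private double[] cache;
    private bool dirty = true;

    void setData(double[] newData)
    {
        data = newData;
        dirty = true;  // invalidate the cache
    }

    // Returns a slice of the cached values, recomputing only if needed.
    const(double)[] values()
    {
        if (dirty)
        {
            cache = data.map!(x => x * x).array;  // stand-in calculation
            dirty = false;
        }
        return cache;
    }
}

void main()
{
    Dataset d;
    d.setData([1.0, 2.0, 3.0]);
    assert(d.values == [1.0, 4.0, 9.0]);  // computed on first access
    assert(d.values is d.values);          // same cached slice, no recomputation
}
```

The dirty flag is what lets repeated calls return a slice of the cache instead of recalculating, which is where the speedup over the always-recalculating C function comes from.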
Now, _that_ turned out to be significantly faster than the C
program, which was almost certainly doing unnecessary
recalculation of the buffer -- because it recalculated every
time the function was called, whereas my program could rely on
the cache, calculate once, and after that just return the slice
of calculated values. On the other hand, if I tweaked the
function's internals so that every call _always_ involved
recalculating and rewriting the cache, it was slightly slower
than the C version -- probably because now it was the C code
doing less recalculation: its callers called the function once
and then reused the buffer, rather than calling it multiple
times.
TL;DR the point is that writing in D gave me the opportunity to
spend mental and programming time exploring these different
choices and focusing on algorithms and data structures, rather
than on all the effort and extra LOC required to get a
_particular_ idea running in C. That's where the real edge
arises.
This is exactly how I see it too. Well said.