On Fri, Dec 06, 2013 at 02:20:22PM -0800, Walter Bright wrote:
> > "there is no way proper C code can be slower than those languages."
> > -- http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy
>
> comes up now and then. I think it's incorrect, D has many inherent
> advantages in generating code over C:
>
> 1. D knows when data is immutable. C has to always make worst case
> assumptions, and assume indirectly accessed data mutates.
Does the compiler currently take advantage of this, e.g., in aliasing analysis?

> 2. D knows when functions are pure. C has to make worst case
> assumptions.

Does the compiler currently take advantage of this?

> 3. Function inlining has generally been shown to be of tremendous
> value in optimization. D has access to all the source code in the
> program, or at least as much as you're willing to show it, and can
> inline across modules. C cannot inline functions unless they appear
> in the same module or in .h files. It's a rare practice to push many
> functions into .h files. Of course, there are now linkers that can
> do whole program optimization for C, but those are kind of herculean
> efforts to work around that C limitation of being able to see only
> one module at a time.

To be frank, dmd's current inlining capabilities are rather disappointing. However, gdc IME has shown a better track record in that area, so it looks promising.

> 4. C strings are 0-terminated, D strings have a length property. The
> former has major negative performance consequences:
>
> a. lots of strlen()'s are necessary

Yeah, this is a common C bottleneck that most obsessively-optimizing C coders overlook. Many a time, profiling of my "optimal" code has turned up surprising bottlenecks, like fprintf's in the wrong place, or strlen's in inner loops. D's strings may be "heavier" (in the sense that they require a length field as opposed to a mere pointer -- y'know, the kind of stuff obsessively-optimizing C coders lose sleep over), but they come with so many advantages that they're more than worth the price.

> b. using substrings usually requires a malloc/copy/free sequence

Yeah, string manipulation in C (and to a great extent C++) is a royal pain. It actually drove me to use Perl for non-performance-critical string-manipulating code for quite a good number of years.
The heavy syntax required for using a regex library in C, and the memory-management nightmare of keeping track of substrings, all contribute to fragile code, increased development times, and poorer performance. Only obsessively-optimizing C coders can fail to see this (and I used to be among them). ;-)

> 5. CTFE can push a lot of computation to compile time rather than
> run time. This has had spectacular positive performance consequences
> for things like regex. C has no CTFE ability.

D's compile-time capabilities were one of the major factors in my adoption of D. The fact that it allows one to write regex code with nice syntax yet zero runtime overhead is awesome. One of my favorite demonstrations of D's compile-time capabilities is in TDPL, where Andrei describes a PRNG generator that checks for poor PRNG parameters at compile time, and stops compilation if the checks fail. So if it compiles, you're guaranteed the PRNG is good. In C? Well, you google for prime numbers, stick randomly-chosen ones into your PRNG parameters, and cross your fingers and hope for the best... (Or at best, the code will do the checks at runtime and abort, but honestly, *what* C coder would do such a thing?)

> 6. D's array slicing coupled with GC means that many
> malloc/copy/free's normally done in C are unnecessary in D.

It does mean we need to get our act together and improve the GC's performance, though. ;-) But this is a major advantage in array manipulation (esp. string manipulation) in D. In equivalent C code, best practices dictate that slicing should always be done by allocating a new array and copying, because otherwise you introduce intricate dependencies between pointers, and your code either becomes wrong (leaks memory / has dangling pointers) or extremely convoluted (reinventing the GC with reference counting, etc.).
Being able to freely slice any array you want, without needing to keep track of every single pointer, is a big savings in development time, not to mention better runtime performance, because you don't have to keep allocating & copying. All that copying does add up!

There's also another aspect to this: being built into the language, D slices allow different pieces of code to exchange them freely without needing to worry about memory management. In equivalent C code, an efficient implementation of slices would require custom structs (a pair of pointers, or a pointer + length a la D), or specific function parameter conventions (e.g. always pass pointer + length). Since everyone implements this in a slightly different way, interoperating different bits of code becomes a pain. One function takes a pointer + length, another takes a pointer pair, a third takes a custom struct that encapsulates the foregoing, and a fourth takes *another* custom struct that does the same thing but with a different implementation -- so to make the code work together, you have to insert intermediate layers of code to convert between the representations. It's a big development-time sink, and yet another corner for bugs to hide in. In D, since slices are built in, everybody takes a slice, no interconversion is necessary, and everybody is happy.

> 7. D's "final switch" enables more efficient switch code generation,
> because the default doesn't have to be considered.

Not to mention the maintenance advantages of final switch: modify the enum, and the compiler automatically points you to all the places in the code where the new enum value has to be handled. In C, you just modify the enum, and everything still compiles, except now things don't quite work until you add the new value to every switch statement that needs to handle it.
But it's easy to miss one or two places where this is needed, so you have latent bugs that only show up through thorough testing. (And everybody knows that C codebases always come with thorough unittests, right? *ahem* More likely, these bugs go unnoticed until something blows up in a customer's production environment, and then you have to pull an all-nighter to track down what's causing it.) It's not surprising that this led to the belief that switch statements are evil, as any OO aficionado will tell you. Well, *I* say they're evil only because C's implementation of them is so primitive. In D, they can actually be a good thing! (Of course, you can still screw yourself over in D by inserting empty default cases into the wrong switch statements, but you have to actively shoot yourself in the foot to do it, in which case it's really your own fault, not the language's.)

// And I'd also add 8.: most applications can get by with using a GC, and this actually lets you write more efficient code. One particular example I have in mind is a parse function that returns an AST. In C, the AST nodes would be malloc'd, of course, and then everyone downstream has to make sure they call free on each and every AST node in order to avoid memory leakage. In the first version of the code, this is no biggie: just write a function called freeAST() or some such, and make sure everyone calls it. But then you want some parts of the code to hold on to bits of the AST -- say, store a function body with its corresponding symbol-table entry -- and now you have to either recursively copy the AST (inefficient), or transplant it from the main AST and make sure you NULL out the pointers in the parent tree so that it doesn't get double-freed, or use some other such manual technique (very error-prone). But what if *two* places need to hold on to the same AST subtree?
Conventional C wisdom says: duplicate it, since otherwise you'll have a major headache later on trying to keep track of when you can free something. Or reinvent the GC with reference counting, etc., which you'll eventually have to do once parts of the trees start cross-referencing each other. And later yet, you add error handling, and now you have to make sure that every time an error occurs, you call freeAST() on the partially-constructed AST subtrees, and soon your initially simple code turns into a spaghetti of "if(error) goto cleanupAST". Miss just *one* of these cases, and you've got a memory leak.

In D? Just allocate the nodes and return them freely, and let callers cross-reference them as much as they like -- the GC will clean up after you. Faster to code, less error-prone, and more performant (it avoids duplicating AST subtrees, avoids complicated bookkeeping to track when a node can be freed, etc.). Sure, the GC will have to pause your program to run its collection cycles -- but that's not much different from a recursive free() of an AST subtree, which doesn't guarantee an upper bound on running time (if your tree has 1 million nodes, then it has to do 1 million free's, right then, right there, and your program's gonna wait for that to finish, no matter what -- the notion that manual memory management is pause-free is a myth).

It's true that manual memory management tailored to your specific application will always outperform a generic GC, since it can take advantage of your application's specific usage patterns. But 95% of the time, you don't *need* this kind of tweaking; yet in the C world, you have no choice but to manage everything manually. And chances are, your ad hoc implementation will perform more poorly than the GC. After all, not everybody is an expert at writing memory-management code. And, in a team project, expect people to screw up and introduce poor performance and/or pointer bugs, even if you've tweaked the thing to perfection.
Besides, it's so much effort for only a little gain. C coders who have never used a GC'd language don't know what they're missing when it comes to the ease of dealing with complex cross-referencing data structures.


T

-- 
Indifference will certainly be the downfall of mankind, but who cares? -- Miquel van Smoorenburg