On Mon, 12 Apr 2010 12:03:38 -0400, Joseph Wakeling <[email protected]> wrote:

> I thought dev effort was now focusing back on GDC ... ? :-P

AFAIK, gdc hasn't been actively developed for a few years.

ldc, on the other hand, has regular releases. I think ldc may be the future of D compilers, but I currently use dmd since I'm using D2.


> Steven Schveighoffer wrote:
>> The C++ example is reallocating memory, freeing memory it is no longer
>> using. It also manually handles the memory management, allocating larger
>> and larger arrays in some algorithmically determined fashion (for example,
>> multiplying the length by some constant factor). This gives it an edge in
>> performance because it does not have to do any costly lookup to determine
>> if it can append in place, plus the realloc of the memory probably is
>> cheaper than the GC realloc of D.

> Right.  In fact you get precisely 24 allocs/deallocs, each doubling the
> memory reserve to give a total capacity of 2^23 -- and then that memory is
> there and can be used for the rest of the 100 iterations of the outer loop.
> The shock for me was finding that D wasn't treating the memory like this
> but was preserving each loop's memory (as you say, for good reason).

Yes, you get around this by preallocating.
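
Something like this, say -- a minimal sketch, assuming a druntime recent enough to have reserve and assumeSafeAppend for builtin arrays:

void main()
{
    int[] buf;
    buf.reserve(1 << 23);        // one allocation up front, like the endpoint
                                 // of the C++ doubling scheme
    foreach (iter; 0 .. 100)
    {
        buf.length = 0;          // reuse the same block on each outer iteration
        buf.assumeSafeAppend();  // promise the runtime nothing else still
                                 // references the old contents
        foreach (i; 0 .. 1 << 23)
            buf ~= i;            // appends in place, no reallocation
    }
}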

>> D does not assume you stopped caring about the memory being pointed to
>> when it had to realloc. [...] You can't do the same thing with C++
>> vectors: when they reallocate, the memory they used to own could be
>> freed.  This invalidates all pointers and iterators into the vector,
>> but the language doesn't prevent you from having such dangling pointers.

> I have a vague memory of trying to do something exactly like your example
> when I was working with C++ for the first time, and getting bitten on the
> arse by exactly the problem you describe.  I wish I could remember where.
> I know that I found another (and possibly better) solution to do what I
> wanted, but it would be nice to see if a D-ish solution would give me
> something good.

It's often these types of performance discrepancies that critics point to (not that you are a critic), but it's the cost of having a more comprehensive language. Your appetite for the sheer performance of a language will sour once you get bit by a few of these nasty bugs.

But D fosters a completely different way of thinking about solving problems. One problem with C++'s vector is that it is a value type -- you must pass a reference to avoid copying the entire vector. D's arrays, however, are a hybrid between reference and value types. Often, once you set data in a vector/array, you never change it again. D gives you ways to enforce this (e.g., immutable) and lets you pass around "slices" of your array with zero overhead (no copying). The result is some extremely high-performance code that wouldn't be easy, or maybe even possible, in C++.
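
For example, a slice is just a pointer/length pair, so handing a function a window into a big array copies nothing -- and even if a later append reallocates, existing slices still refer to valid data, because the GC keeps the old block alive. A minimal sketch:

double average(const(double)[] window)   // accepts any slice, no copy made
{
    double sum = 0;
    foreach (x; window)
        sum += x;
    return sum / window.length;
}

void main()
{
    auto data = new double[](1_000_000);
    data[] = 1.0;

    auto avg = average(data[1000 .. 2000]); // zero-overhead view of the middle

    auto view = data[0 .. 10]; // a second slice into the same memory
    data ~= 2.0;               // may reallocate data, but view still points
                               // at the old block -- no dangling pointer,
                               // unlike a C++ vector after push_back
    assert(view[0] == 1.0);
}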

Take, for instance, a split function. In C++, I'd expect split(string x) to return a vector<string>, but that vector<string> holds a copy of each part of the string it has split out. D can instead return references to the original data (slices), which incur no overhead. The only extra space allocated is the array holding the string references. All this is also completely safe!

You could even modify the original string in place (assuming you were not using immutable strings)! Or safely append to any one of the strings in the array.
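
A sketch of the idea (splitWords is a made-up name; Phobos has its own split, but this shows the slicing):

// Splits on single spaces. Every element is a slice of the original
// string, so no character data is ever copied.
string[] splitWords(string s)
{
    string[] parts;
    size_t start = 0;
    foreach (i, c; s)
    {
        if (c == ' ')
        {
            if (i > start)
                parts ~= s[start .. i]; // a slice, not a copy
            start = i + 1;
        }
    }
    if (start < s.length)
        parts ~= s[start .. $];
    return parts;
}

unittest
{
    auto words = splitWords("fast safe slices");
    assert(words == ["fast", "safe", "slices"]);
}

Because the parameter is string (immutable(char)[]), the compiler guarantees nobody mutates the characters out from under the slices; take a char[] instead and you could edit the words in place.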

>> This must be fixed; the appender should be blazingly fast at appending
>> (almost as fast as C++), with the drawback that the overhead is higher.

> Overhead = memory cost?  I'm not so bothered as long as the memory stays
> within constant, predictable bounds.  It was the memory explosion that
> scared me.  And I suspect I'd pay a small performance cost (though it
> would have to be small) for the kind of safety and flexibility the arrays
> have.

Overhead = bigger initialization cost and a larger memory footprint. That's not important if you are building a large array (which is what appender should be for), but the cost would add up if you had lots of little appenders that you didn't append much to. The point is, the builtin array optimizes performance for operations besides append, but allows appending as a convenience. Appender should optimize appending, sacrificing performance in other areas. Whether you should use appender, builtin arrays, or something entirely different/custom depends on your particular application.
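
For reference, this is roughly the usage I have in mind for it (std.array.appender, as in current Phobos -- the exact API may shift between releases):

import std.array : appender;

void main()
{
    auto app = appender!(int[])();
    foreach (i; 0 .. 1_000_000)
        app.put(i);          // appender tracks its own capacity,
                             // so each append is cheap
    int[] result = app.data; // the built-up array
}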

>> You haven't done much with it yet. When you start discovering how much D
>> takes care of, you will be amazed :)

> I know. :-)

> My needs are in some ways quite narrow -- numerical simulations in
> interdisciplinary physics -- hence the C background, and hence the premium
> on performance.  They're also not very big programs -- simple enough for me
> to generally keep a personal overview on the memory management, even though
> with C++ that's usually all taken care of automatically (no new or delete
> statements if I can avoid it).

There are many in the community who use D for numerical stuff. It's definitely not as mature as it could be, but it's getting better. Don is adding a lot of cool stuff to it, including a builtin exponentiation operator and arbitrary-precision numbers.
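
e.g., something like this -- assuming a dmd recent enough to have both, since the ^^ operator and std.bigint are quite new:

import std.bigint;
import std.math;  // floating-point ^^ is lowered to pow from here

void main()
{
    auto x = 2.0 ^^ 10;          // builtin exponentiation operator
    auto big = BigInt(2) ^^ 100; // arbitrary precision: 2^100 exactly
    assert(x == 1024.0);
    assert(big == BigInt("1267650600228229401496703205376"));
}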

>> The thing about D is it *can* be fast and unsafe, just as fast and unsafe
>> as C, but that's not the default.

> That's apparent -- I mean, given that D wraps the whole C standard library,
> I could basically write C code in D if I wanted, no?

Yes, but that's not what I meant ;) I mean, you can write your own types, like Appender (or what Appender *should* be), that optimize the behavior of your code to meet any need. And you can do it with a much better syntax than C. I think D's template system and its ability to make user types seem like builtins are unparalleled among C-like languages.
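
A toy example of what I mean -- a hypothetical fixed-capacity appender whose appends read exactly like the builtin's (a sketch, not a drop-in replacement):

struct FixedAppender(T, size_t capacity)
{
    private T[capacity] store; // lives on the stack: no heap allocation at all
    private size_t len;

    // Overloading ~= is what makes the user type read like a builtin:
    void opOpAssign(string op : "~")(T value)
    {
        assert(len < capacity, "FixedAppender is full");
        store[len++] = value;
    }

    // Note: the returned slice is only valid while the appender is alive.
    T[] data() { return store[0 .. len]; }
}

unittest
{
    FixedAppender!(int, 4) app;
    app ~= 1;
    app ~= 2;
    assert(app.data == [1, 2]);
}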

-Steve
