On Mon, 12 Apr 2010 12:03:38 -0400, Joseph Wakeling <[email protected]> wrote:

> I thought dev effort was now focusing back on GDC ... ? :-P

AFAIK, gdc hasn't been actively developed for a few years.

ldc, on the other hand, has regular releases. I think ldc may be the future of D compilers, but I currently use dmd since I'm using D2.


> Steven Schveighoffer wrote:
>> The C++ example is reallocating memory, freeing memory it is no longer
>> using. It also manually handles the memory management, allocating larger
>> and larger arrays in some algorithmically determined fashion (for example,
>> multiplying the length by some constant factor). This gives it an edge in
>> performance because it does not have to do any costly lookup to determine
>> if it can append in place, plus the realloc of the memory probably is
>> cheaper than the GC realloc of D.

> Right.  In fact you get precisely 24 allocs/deallocs, each doubling the
> memory reserve to give a total capacity of 2^23 -- and then that memory is
> there and can be used for the rest of the 100 iterations of the outer loop.
> The shock for me was finding that D wasn't treating the memory like this
> but was preserving each loop's memory (as you say, for good reason).

Yes, you get around this by preallocating.
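
Something like this, say -- a minimal sketch, assuming a druntime recent enough to have reserve and assumeSafeAppend for builtin arrays:

void main()
{
    int[] buf;
    buf.reserve(1 << 23);        // one allocation up front, like the endpoint
                                 // of the C++ doubling scheme
    foreach (iter; 0 .. 100)
    {
        buf.length = 0;          // reuse the same block on each outer iteration
        buf.assumeSafeAppend();  // promise the runtime nothing else still
                                 // references the old contents
        foreach (i; 0 .. 1 << 23)
            buf ~= i;            // appends in place, no reallocation
    }
}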

>> D does not assume you stopped caring about the memory being pointed to
>> when it had to realloc. [...] You can't do the same thing with C++
>> vectors: when they reallocate, the memory they used to own could be
>> freed.  This invalidates all pointers and iterators into the vector,
>> but the language doesn't prevent you from having such dangling pointers.

> I have a vague memory of trying to do something exactly like your example
> when I was working with C++ for the first time, and getting bitten on the
> arse by exactly the problem you describe.  I wish I could remember where.
> I know that I found another (and possibly better) solution to do what I
> wanted, but it would be nice to see if a D-ish solution would give me
> something good.

It's often these types of performance discrepancies that critics point to (not that you are a critic), but it's the cost of having a more comprehensive language. Your appetite for the sheer performance of a language will sour once you get bit by a few of these nasty bugs.

But D fosters a completely different way of thinking about solving problems. One problem with C++'s vector is that it is a value type -- you must pass a reference to avoid copying the entire vector. D's arrays, however, are a hybrid between reference and value types. Often, once you set data in a vector/array, you never change it again. D gives you ways to enforce this (e.g., immutable) and lets you pass around "slices" of your array with zero overhead (no copying). The result is some extremely high-performance code that wouldn't be easy, or maybe even possible, in C++.
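
For example, a slice is just a pointer/length pair, so handing a function a window into a big array copies nothing -- and even if a later append reallocates, existing slices still refer to valid data, because the GC keeps the old block alive. A minimal sketch:

double average(const(double)[] window)   // accepts any slice, no copy made
{
    double sum = 0;
    foreach (x; window)
        sum += x;
    return sum / window.length;
}

void main()
{
    auto data = new double[](1_000_000);
    data[] = 1.0;

    auto avg = average(data[1000 .. 2000]); // zero-overhead view of the middle

    auto view = data[0 .. 10]; // a second slice into the same memory
    data ~= 2.0;               // may reallocate data, but view still points
                               // at the old block -- no dangling pointer,
                               // unlike a C++ vector after push_back
    assert(view[0] == 1.0);
}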

Take, for instance, a split function. In C++, I'd expect split(string x) to return a vector<string>, but that vector<string> holds a copy of each part of the string it has split out. D can instead return references to the original data (slices), which incur no overhead. The only extra space allocated is the array holding the string references. All this is also completely safe!

You could even modify the original string in place (assuming you were not using immutable strings)! Or safely append to any one of the strings in the array.
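
A sketch of the idea (splitWords is a made-up name; Phobos has its own split, but this shows the slicing):

// Splits on single spaces. Every element is a slice of the original
// string, so no character data is ever copied.
string[] splitWords(string s)
{
    string[] parts;
    size_t start = 0;
    foreach (i, c; s)
    {
        if (c == ' ')
        {
            if (i > start)
                parts ~= s[start .. i]; // a slice, not a copy
            start = i + 1;
        }
    }
    if (start < s.length)
        parts ~= s[start .. $];
    return parts;
}

unittest
{
    auto words = splitWords("fast safe slices");
    assert(words == ["fast", "safe", "slices"]);
}

Because the parameter is string (immutable(char)[]), the compiler guarantees nobody mutates the characters out from under the slices; take a char[] instead and you could edit the words in place.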

>> This must be fixed; the appender should be blazingly fast at appending
>> (almost as fast as C++), with the drawback that the overhead is higher.

> Overhead = memory cost?  I'm not so bothered as long as the memory stays
> within constant, predictable bounds.  It was the memory explosion that
> scared me.  And I suspect I'd pay a small performance cost (though it
> would have to be small) for the kind of safety and flexibility the arrays
> have.

Overhead = bigger initialization cost and a larger memory footprint. That's not important if you are building a large array (which is what appender should be for), but the cost would add up if you had lots of little appenders that you didn't append much to. The point is, the builtin array optimizes performance for operations besides append, but allows appending as a convenience. Appender should optimize appending, sacrificing performance in other areas. Whether you should use appender, builtin arrays, or something entirely different/custom depends on your particular application.
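
For reference, this is roughly the usage I have in mind for it (std.array.appender, as in current Phobos -- the exact API may shift between releases):

import std.array : appender;

void main()
{
    auto app = appender!(int[])();
    foreach (i; 0 .. 1_000_000)
        app.put(i);          // appender tracks its own capacity,
                             // so each append is cheap
    int[] result = app.data; // the built-up array
}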

>> You haven't done much with it yet. When you start discovering how much D
>> takes care of, you will be amazed :)

> I know. :-)

> My needs are in some ways quite narrow -- numerical simulations in
> interdisciplinary physics -- hence the C background, and hence the premium
> on performance.  They're also not very big programs -- simple enough for me
> to generally keep a personal overview on the memory management, even though
> with C++ that's usually all taken care of automatically (no new or delete
> statements if I can avoid it).

There are many in the community who use D for numerical stuff. It's definitely not as mature as it could be, but it's getting better. Don is adding a lot of cool stuff to it, including a builtin exponentiation operator and arbitrary-precision numbers.
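
e.g., something like this -- assuming a dmd recent enough to have both, since the ^^ operator and std.bigint are quite new:

import std.bigint;
import std.math;  // floating-point ^^ is lowered to pow from here

void main()
{
    auto x = 2.0 ^^ 10;          // builtin exponentiation operator
    auto big = BigInt(2) ^^ 100; // arbitrary precision: 2^100 exactly
    assert(x == 1024.0);
    assert(big == BigInt("1267650600228229401496703205376"));
}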

>> The thing about D is it *can* be fast and unsafe, just as fast and unsafe
>> as C, but that's not the default.

> That's apparent -- I mean, given that D wraps the whole C standard library,
> I could basically write C code in D if I wanted, no?

Yes, but that's not what I meant ;) I mean, you can write your own types, like Appender (or what Appender *should* be), that optimize the behavior of your code to meet any need. And you can do it with a much better syntax than C. I think D's template system and its ability to make user types seem like builtins are unparalleled among C-like languages.
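
A toy example of what I mean -- a hypothetical fixed-capacity appender whose appends read exactly like the builtin's (a sketch, not a drop-in replacement):

struct FixedAppender(T, size_t capacity)
{
    private T[capacity] store; // lives on the stack: no heap allocation at all
    private size_t len;

    // Overloading ~= is what makes the user type read like a builtin:
    void opOpAssign(string op : "~")(T value)
    {
        assert(len < capacity, "FixedAppender is full");
        store[len++] = value;
    }

    // Note: the returned slice is only valid while the appender is alive.
    T[] data() { return store[0 .. len]; }
}

unittest
{
    FixedAppender!(int, 4) app;
    app ~= 1;
    app ~= 2;
    assert(app.data == [1, 2]);
}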

-Steve
