Re: Memory leak with dynamic array

Steven Schveighoffer Sun, 11 Apr 2010 20:45:12 -0700

On Sun, 11 Apr 2010 12:50:11 -0400, Joseph Wakeling wrote:

> I was very happy to see that D _does_ have a 'reserve' function for arrays,
> which I had been missing compared to C++ (it's not mentioned in the array
> docs).

It's new. It probably should be mentioned in the spec, but it's a druntime thing.

> Still, I don't think that pre-reserving the memory per se is the influencing
> factor on the differences in performance.

No, and like bearophile said, the D array is a compromise between performance and flexibility. There are amazing ways to use D arrays that you could never do with C++ vectors.

> Note that in C++ the memory is not preassigned either.  The difference
> between the performance of these pieces of code is striking -- on my
> machine the D example takes about 70--75 seconds to run, whereas the
> C++ example speeds through it in 10s or less.

The C++ example is reallocating memory, freeing memory it is no longer using. It also manually handles the memory management, allocating larger and larger arrays in some algorithmically determined fashion (for example, multiplying the length by some constant factor). This gives it an edge in performance because it does not have to do any costly lookup to determine if it can append in place, plus the realloc of the memory probably is cheaper than the GC realloc of D.
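
To make the growth strategy concrete, here is a minimal C++ sketch that counts how often a std::vector changes capacity while appending. The helper name count_reallocations is hypothetical, and the growth factor itself is implementation-defined, so the exact count will differ between standard libraries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: append n ints with push_back and count how many
// times the capacity changes, i.e. how often the vector reallocated.
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();

    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            // The vector grew: new storage was allocated and the old
            // storage was released immediately.
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Because the capacity grows geometrically, a call such as count_reallocations(1000000) should report only a few dozen reallocations rather than one per append.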

If you want to compare apples to apples (well, probably more like red apples to green apples), you need to do these things in a struct for D. I had thought the D appender class would do the trick, but as you stated below, it's even slower. This needs to be remedied.

> D also uses about 20% more memory than the C++ even though the C++ code
> declares a higher capacity for the vector (above 8 million) than D does
> for the array (a little over 5 million).

D does not assume you stopped caring about the memory being pointed to when it had to realloc. Therefore, it leaves it around in case you are still using it. You can also expect more overhead in the GC because it tends to hang on to memory in case it wants to use it again, or because it hasn't collected it yet. This is true of most GC-based languages.


For example, you can do this in D:

int[] arr = new int[50];
int *arrelem = &arr[5];

for(int i = 0; i < 10000; i++)
  arr ~= i;

*arrelem = 12345; // valid.

You can't do the same thing with C++ vectors: when they reallocate, the memory they used to own could be freed. This invalidates all pointers and iterators into the vector, but the language doesn't prevent you from having such dangling pointers. Using one of them can result in memory corruption, one of the worst bugs. D tries to eliminate as many memory corruption problems as possible.
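
For comparison, here is a hedged C++ sketch of the same scenario. It never dereferences a stale pointer; it only records addresses as integers to observe that the storage moves, and then shows the usual workaround of remembering an index instead of an address. The function names count_storage_moves and write_through_index are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: count how many push_back calls moved the vector's
// storage. Addresses are stored as integers so no stale pointer is used.
std::size_t count_storage_moves()
{
    std::vector<int> v(50);
    std::size_t moves = 0;
    for (int i = 0; i < 10000; ++i) {
        const auto before = reinterpret_cast<std::uintptr_t>(v.data());
        v.push_back(i);
        const auto after = reinterpret_cast<std::uintptr_t>(v.data());
        if (before != after)
            ++moves; // any pointer taken before this append now dangles
    }
    return moves;
}

// The safe C++ equivalent of the D snippet: keep an index, not a pointer,
// and look the element up again after the appends.
int write_through_index()
{
    std::vector<int> v(50);
    const std::size_t idx = 5;
    for (int i = 0; i < 10000; ++i)
        v.push_back(i);
    v[idx] = 12345; // valid: indexes into the current storage
    return v[idx];
}
```

Each move counted by count_storage_moves is a point where a raw pointer such as &v[5] taken earlier would have been invalidated.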


> I don't know if it was naive to assume that D's dynamic arrays would be
> equivalent to vectors in C++.  But it's clear that the D array appender
> ~= is much slower than the C++ vector push_back().


But it is much safer, and it supports safe slicing.  Compromises.

> Using an Appender class and put() (from std.array) is even slower,
> despite the std.array docs recommending this over ~. :-(

This must be fixed: the appender should be blazingly fast at appending (almost as fast as C++), with the drawback that the overhead is higher.

> It's disappointing because I'm loving so much of D's syntax, but I can
> write far more efficient code (with less explicit memory management)
> in C++ ...

You haven't done much with it yet. When you start discovering how much D takes care of, you will be amazed :)

Array appending isn't the fastest operation, but it is safe, and safety can be worth infinitely more than speed sometimes.

The thing about D is it *can* be fast and unsafe, just as fast and unsafe as C, but that's not the default.


-Steve
