MLT wrote:
Walter Bright wrote:
Vladimir Panteleev wrote:
I don't know why it was decided to mark the contents of void[] as
"might have pointers". It makes no sense! Consider:
[...]
3) It's very rare in practice for the only pointer to your
object (which you still plan to access later) to be stored in a
void[]-allocated array!
Rare or common, it still would be a nasty bug lurking to catch someone.
The default behavior in D should be correct code. Doing
potentially unsafe things to improve performance should require extra
effort - in this case it would be either using the gc function to mark
the memory as not containing pointers, or storing them as ubyte[] instead.
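A minimal sketch of the difference between the two allocations, assuming a D2 toolchain whose druntime exposes `core.memory.GC.query` and `GC.BlkAttr.NO_SCAN` (names that do not appear in this thread). `isNoScan` is a hypothetical helper introduced only for illustration. The attributes a particular runtime assigns may differ, so the asserts state what I would expect, not a guarantee.

```d
import core.memory : GC;

/// Hypothetical helper: true if the GC block containing `p` is flagged NO_SCAN.
/// GC.query accepts interior pointers, so an array's .ptr can be passed directly.
bool isNoScan(const(void)* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    ubyte[] raw = new ubyte[](64);    // element type cannot hold pointers
    void[] untyped = new void[](64);  // element type "might have pointers"

    assert(isNoScan(raw.ptr));        // expected: block is skipped by the scan
    assert(!isNoScan(untyped.ptr));   // expected: block is scanned conservatively
}
```

Under those assumptions, storing raw bytes in a ubyte[] gets the pointer-free behavior with no extra call.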
As quite a newbie, I can sum up what I understood as follows:
1. The idea of void[] is that you can put anything in it without casting.
2. Because of this, you might put pointers in a void[].
3. Since you have "legitimately" stored pointers, and we don't want to have the
GC throw away something that we still have valid pointers for, we have to have the GC
scan over void[] arrays for possible hits.
4. This pretty much means that any "big"(*) D program cannot afford to put
uniformly distributed data in a void[] array, because the GC will stop working correctly
- it will not dispose of stuff that you don't need any more.
(*) where "big" means a program that creates and destroys a lot of objects.
So, currently if you want to use void[] to store non-pointers, you need to use
the gc function to mark the memory as not containing pointers.
A comment and a question. I agree that suddenly losing data because you stored a pointer in a void[] is worse than the GC not working well. However, since the GC in D is so automatic, almost any use of void[] to store non-pointer data will cause massive memory leaks and eventual program failure.
First, this is no problem if you are merely aliasing an existing array.
In order for it to be an issue, you must copy from some array to a
void[] -- for instance, appending to an existing void[], or .dup'ing a
void[] alias. (While a GC could work around the latter case, it would be
unsafe -- you can append something with pointers to a void[] copy of an
int[].)
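The aliasing case can be sketched as follows, assuming D2 and `core.memory.GC.addrOf`. `sameBlock` is a hypothetical helper, not something from this thread. Converting an int[] to void[] allocates nothing: the view points into the block that was allocated as int[], so whatever scanning attribute that block already has stays in effect.

```d
import core.memory : GC;

/// Hypothetical helper: true if both pointers fall inside the same GC block.
bool sameBlock(const(void)* a, const(void)* b)
{
    auto baseA = GC.addrOf(a);
    return baseA !is null && baseA is GC.addrOf(b);
}

void main()
{
    int[] numbers = new int[](8);
    void[] view = numbers;  // implicit conversion: an alias, not a copy

    assert(sameBlock(view.ptr, numbers.ptr));
    // The length of a void[] view is counted in bytes.
    assert(view.length == numbers.length * int.sizeof);
}
```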
I can see 4 solutions...
First, to not allow non-pointers to be stored in void[]. So non-pointers are
stored in ubyte[], pointers in void[]. Kinda loses the main point of using
void[].
Second, void[] is not scanned by GC, but you can mark it to be. This can cause
bugs if you store a pointer in void[], and later retrieve it, but don't mark
it correctly.
This is an unsafe option.
Third, void[] is scanned by GC, but you can mark it not to be. This can cause
memory leaks if you store complex data in void[] in a big program, and don't
handle GC marking correctly.
This is already available. If you know your array doesn't have pointers,
you can call GC.hasNoPointers(array.ptr).
This is a safe option.
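For readers on a current D2 toolchain, a hedged sketch of the same opt-out. The thread names `GC.hasNoPointers`. The calls below (`GC.setAttr`, `GC.addrOf`, `GC.query`, `GC.BlkAttr.NO_SCAN` from `core.memory`) are the ones I know from D2's druntime and are an assumption about your environment. `markPointerFree` and `isScanned` are hypothetical helpers.

```d
import core.memory : GC;

/// Hypothetical helper: flags the GC block holding `data` as pointer-free.
/// Only correct if the caller knows no live pointer is stored in that block.
void markPointerFree(void[] data)
{
    // setAttr expects the base address of the block, so resolve it first.
    auto base = GC.addrOf(data.ptr);
    if (base !is null)
        GC.setAttr(base, GC.BlkAttr.NO_SCAN);
}

/// Hypothetical helper: true if the block holding `data` is still scanned.
bool isScanned(const(void)[] data)
{
    return (GC.query(data.ptr).attr & GC.BlkAttr.NO_SCAN) == 0;
}

void main()
{
    void[] buf = new void[](4096);
    markPointerFree(buf);
    assert(!isScanned(buf));
}
```

The flag applies to the whole block, so this only fits buffers that hold no pointers at all.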
Fourth - somewhat more complex. Since the compiler knows exactly when a pointer
is stored in a void[] and when not, it would be possible to have the compiler
handle it all by itself, as long as the property of having to be scanned by GC is
dirty - once a variable has it, any other that touches that variable gets the
property.
This isn't really the case unless you get some really invasive whole
program analysis (not available with D's compilation model, or if you
want to interact with code written in other languages, or if you want to
do runtime dynamic linking) or a really invasive runtime (think of
calling a method every time you access an array).
In point of fact, that's not going to be enough. You need to call the
runtime with every assignment, since you might be passing individual
ubytes around when they're part of a pointer and reassembling them
somewhere else.
Of these four solutions, the last 3 can still cause bugs if one stores both
pointers and data in the same void[] array, no matter how the memory is marked,
unless one does that marking on a very fine scale (is that possible?)
struct S
{
    int i;
    int* j;
}
You're screwed.
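A small sketch of why the mixed case has no fine-grained answer, assuming D2 with Phobos (`std.traits.hasIndirections`) and druntime's `core.memory`. `isNoScan` is a hypothetical helper. The scan attribute is recorded per memory block, so an array of S is scanned as a whole, including the plain int fields.

```d
import core.memory : GC;
import std.traits : hasIndirections;

struct S
{
    int i;   // plain data
    int* j;  // pointer the GC must see
}

// The type as a whole counts as "has pointers"; there is no per-field flag.
static assert(hasIndirections!S);
static assert(!hasIndirections!int);

/// Hypothetical helper: true if the GC block containing `p` is flagged NO_SCAN.
bool isNoScan(const(void)* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    S[] items = new S[](4);
    // Expected: the block is scanned, so every `i` is also treated as a
    // possible pointer by a conservative collector.
    assert(!isNoScan(items.ptr));
}
```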
My conclusion from all this is either "don't use void[]", or "only use void[] to
store pointers" if you don't want bugs in a valid program.
Not bugs, but potential performance issues. And the advice should be
"don't allocate void[]", to split hairs.