Leandro Lucarella wrote:
Christopher Wright, on April 12 at 17:54, you wrote to me:
Absolutely.  When writing parallel code to do large scale data mining in D, the
lack of precision and multithreaded allocation are real killers.  My interests
are, in order of importance:

1.  Being able to allocate at least small chunks of memory without locking.
2.  Precise scanning of at least the heap.
3.  Collection w/o stopping the world.
4.  Moving GC so that allocations can be pointer bumps.
3. is my main goal right now. I think 1. can be done using thread-specific
free lists/pools. 2. is possible too, but bigger changes are needed,
especially on the compiler side (1. and 3. can be done completely in the GC
implementation). 4. is not 100% possible because we can never have a 100%
precise GC, but we can get very close if 2. is fixed =)
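
To make the "thread-specific free lists" idea for point 1 concrete, here is a minimal sketch, not code from any existing collector. It assumes D2, where module-level variables are thread-local by default, and the names chunkSize, FreeChunk, allocateSmall, releaseSmall and refill are all made up for illustration. A real GC would take its global lock in the refill path; malloc stands in for that here.

```d
import core.stdc.stdlib : malloc;

enum size_t chunkSize = 32;          // hypothetical small-object size class

struct FreeChunk { FreeChunk* next; }

// Thread-local by default in D2, so the fast path needs no lock.
FreeChunk* freeList;

// Put a chunk back on the calling thread's list.
void releaseSmall(void* p)
{
    auto chunk = cast(FreeChunk*) p;
    chunk.next = freeList;
    freeList = chunk;
}

// Slow path: fetch a block and thread all chunks but the first onto the list.
void* refill()
{
    enum chunksPerBlock = 64;
    auto block = cast(ubyte*) malloc(chunkSize * chunksPerBlock);
    if (block is null)
        return null;
    foreach (n; 1 .. chunksPerBlock)
        releaseSmall(block + n * chunkSize);
    return block;                    // first chunk goes to the caller
}

// Fast path: pop from the thread-local list without locking.
void* allocateSmall()
{
    if (freeList is null)
        return refill();
    auto chunk = freeList;
    freeList = chunk.next;
    return chunk;
}

void main()
{
    auto a = allocateSmall();
    auto b = allocateSmall();
    assert(a !is null && b !is null && a !is b);
    releaseSmall(a);
    assert(allocateSmall() is a);    // most recently freed chunk is reused first
}
```

The sketch ignores how the collector would find and sweep these per-thread lists, which is where most of the real work would be.
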
You can create StackInfo similar to TypeInfo, I suppose, and thus get an
entirely precise GC.

Sure. This is a big (compiler) change, and you probably have to drop
C compatibility (what would you do with C functions' stack frames without
StackInfo? How do you know if a stack frame is from an "untyped"
C function or a "typed" D one? Where do you search for that StackInfo?).
But it's definitely possible in theory.

Actually, it's not possible in D as it stands. Consider:
    union U {
        size_t i;
        void* p;
    }
There's no way for the GC to know whether an instance of this type is storing a pointer or an integer that happens to look like a pointer. So unless we're dropping support for unions (and void[]s as they exist currently), any GC needs to support some things that may be either pointers or non-pointers, and (implicitly?) pin allocations accordingly. So stack frames not described by a StackInfo instance can just be considered to consist of data that may or may not be pointers, just like the union above.
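
A small sketch of why the union above is ambiguous: two instances, one written through p and one through i, end up with identical bits. The helper rawBits is hypothetical and only stands in for what a scanner without extra type information gets to see.

```d
import std.stdio : writeln;

union U
{
    size_t i;
    void* p;
}

// Hypothetical helper: the raw word is all a scanner can look at.
size_t rawBits(ref const U u)
{
    return u.i;
}

void main()
{
    int target = 42;

    U holdsPointer;
    holdsPointer.p = &target;                // really a pointer

    U holdsInteger;
    holdsInteger.i = cast(size_t) &target;   // an integer that only looks like one

    // Bit-for-bit identical, so both have to be treated as "maybe a pointer"
    // and whatever they seem to reference has to stay where it is.
    assert(rawBits(holdsPointer) == rawBits(holdsInteger));
    writeln("indistinguishable: ", rawBits(holdsPointer) == rawBits(holdsInteger));
}
```
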

Or you can pin anything that's referenced from the stack, and move
anything that is only referenced from the heap.

That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".
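
For reference, this is roughly what "pointer bump" allocation means: the allocator keeps one offset into a contiguous region and advances it on every request. The sketch below is an illustration only; BumpRegion and the 16-byte alignment are assumptions, not something taken from an existing D runtime.

```d
struct BumpRegion
{
    ubyte[] memory;   // contiguous backing storage
    size_t top;       // offset of the first free byte

    // Returns null once the region is exhausted.
    void* allocate(size_t size)
    {
        enum size_t alignment = 16;   // assumed alignment for the sketch
        immutable aligned = (size + alignment - 1) & ~(alignment - 1);
        if (aligned > memory.length - top)
            return null;
        void* result = memory.ptr + top;
        top += aligned;               // the "bump"
        return result;
    }
}

void main()
{
    auto region = BumpRegion(new ubyte[](1024));
    auto a = region.allocate(24);     // rounded up to 32 bytes
    auto b = region.allocate(8);      // rounded up to 16 bytes
    assert(a !is null && b !is null);
    assert(cast(ubyte*) b - cast(ubyte*) a == 32);
    assert(region.allocate(4096) is null);
}
```

This only works if everything behind top is free, which is what a moving collector guarantees after compaction.
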

The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo can produce a TypeInfo[], it just happens that DMD and GDC don't fill it in for user-defined aggregates, and LDC needs a compile-time #define to enable it (because it breaks linking the Tango runtime, IIRC).
(For other types, the fact that it returns null is a simple library issue.)

What I mean is you can be as precise as you want, but as long as unions and
void[] are there, there will always be "might be a pointer" fields, and cells

Oh, I hadn't read that part yet when I started typing this post :)

pointed to by that kind of field should never be moved. So, even after
a fresh collection, your heap can still be fragmented. You have to store
information about the "holes" and take care of them. This can be very
light too (in comparison with the actual allocation algorithm), but it can
never be as simple as a "pointer bump" (as requested by David =).
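
A sketch of the extra bookkeeping described above, under the assumption that pinned cells are kept as a sorted list of offset ranges. Hole and HoleyRegion are hypothetical names. Allocation is still mostly a bump, but every request has to be checked against the next pinned cell and may have to jump over it.

```d
struct Hole { size_t start, end; }   // pinned cell occupying [start, end)

struct HoleyRegion
{
    size_t capacity;
    Hole[] pinned;     // sorted by start, non-overlapping
    size_t top;        // offset of the first free byte
    size_t nextHole;   // index of the first pinned cell not yet passed

    // Returns an offset into the region, or size_t.max if nothing fits.
    size_t allocate(size_t size)
    {
        // Skip every pinned cell the request would run into.
        while (nextHole < pinned.length && top + size > pinned[nextHole].start)
        {
            top = pinned[nextHole].end;
            ++nextHole;
        }
        if (top + size > capacity)
            return size_t.max;
        immutable result = top;
        top += size;
        return result;
    }
}

void main()
{
    auto region = HoleyRegion(100, [Hole(16, 24), Hole(40, 48)]);
    assert(region.allocate(16) == 0);    // fits before the first pinned cell
    assert(region.allocate(8) == 24);    // jumps over [16, 24)
    assert(region.allocate(16) == 48);   // jumps over [40, 48), wasting [32, 40)
    assert(region.allocate(40) == size_t.max);
}
```

The gap at [32, 40) in the usage above is the kind of fragmentation that stays around even right after a collection.
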

Well, it may technically be possible to move a heap object right before assignment to a union/void[] or passing to C if the compiler calls a library function before doing something like that. Then pinned objects could be allocated on a separate part of the heap that never gets moved (unless no more references in untyped memory are live, maybe?) and allocations could still be a pointer bump in the movable part of the heap.
I have no idea how efficient this would be, however. My guess would be not very.

So technically, you'll always have to deal with memory fragmentation in
D (I don't think anyone wants to drop unions and void[] =), and it's true
that it can be minimized to almost nothing. But since it's technically
possible, you can never get away from the extra complexity for managing
those rare cases.
