On Tuesday, 24 February 2015 at 08:39:02 UTC, Kagamin wrote:
On Tuesday, 24 February 2015 at 00:30:43 UTC, ketmar wrote:
On Mon, 23 Feb 2015 21:11:22 +0000, Sativa wrote:

How hard would it be to modify D's GC to do the following two things:

1. Scan the heap in the BG on another thread/cpu for compactification.

needs read/write barriers added to generated code. a major slowdown for
ALL memory access.

Only modifications of pointers, which introduce new cross-block dependencies (so that GC knows to recheck the new dependency). Other memory access goes without slowdown.

But this type of thinking is the reason why the current GC is in the state it is.

The compiler knows which pointers are "free" and which ones are "bound". Bound pointers are pointers that are not assigned freely by the user. e.g., a pointer to an array who's address never is arbitrarily set by the user is bound. The compiler knows where and how the pointer is assigned. Most pointers are this way.

Bound pointers are pointers the GC can easily clean up because it knows when and how they are used. In this way, if all pointers of a program were bound, the GC can work in the background and never pause the state to clean up. (essentially the compiler would need to insert special code) most pointers are bound pointers.

Free pointers are more difficult as they can, say, be randomly initiated and point to anywhere on the heap and have to be looked in a locked way. (to prevent them changing in the middle of some GC operation)

But if one distinguishes bound and free pointers(Easily done with a bit in the pointers) and has the compiler keep track of when free pointers are used(by having a dirty bit when they are written to), then one can more easily scan the heap in the background.

In fact, one could potentially get away from all synching issues by doing the following:

When ever free pointers are used a simple spin lock is used. The spin lock checks a flag in the free pointers table that signals that a pointer is being changed by the code. When this is true, the free pointers table is in a state of flux and can't be relied on. In the mean time, the GC can build up information about the heap for the bound pointers. It can figure out what needs to be changed, setup buffering(which can be done using bits in the pointer), etc all in the background because the bound pointers are "stable" and deterministically change.

When the free pointers table's dirty flag is unset it means that the free pointers are not changing in the program and the GC can lock the table using another flag. When the flag is set the spin lock kicks in and pauses the program while the GC is working on the free pointers table. (or to be more efficient, the program can yield to some other background task code)

By having multiple tables of free pointers one can reduce the overhead. The GC looks at on a piece at a time and locks on a fraction of the code at any point in time. The compiler can distribute the locks vs pages in an optimized way through profiling.





Reply via email to