So I've been thinking pretty hard about interior object references, because
it's harder than I thought at first.

My initial thought about interior references was that we would need to use
two different referencer representations depending on whether the runtime
performed relocation or not. The problem is that in a relocating collector
we need a way to find the header of the outer object. That's actually
pretty ugly, because it makes object size and representation contingent on
the selection of GC technique. It's also a problem because "fat references"
are conservatively contagious. For example, it probably would mean that all
/this/ pointers would have to become fat references.

So I spent some more time thinking about it, and maybe I have some ideas.
I'd like feedback, alternatives, and pointers to documentation of other
solutions.

The balance of this email is *only* considering relocating collectors.

*Assumptions*

First, let me state my assumptions about on-heap objects from the GC point
of view:

   1. The initial word contains the mark bits, GC colors, and object size
   in words. This object is "owned" by the GC. It is not accessed by the
   mutator except in barrier code that is (1) generated by or (2) compiled in
   on behalf of the GC subsystem. I will refer to this as the GC header word.
   2. The GC assumes that the next word is a VTable pointer, *if and only if
   * the object is relocatable *and* this is a relocating GC
   implementation. A relocating GC can be implemented without this assumption,
   but we can't exploit the constant object optimization without it. A
   non-relocating GC does not require this assumption.
   3. Marking and forwarding are performed only on exterior objects. The
   fact that an object has *been* forwarded needs to be recorded on both
   interior and exterior object headers.

Note that as much as I'd prefer to get rid of them, I think *interior* objects
also require object header words. In principle these could be eliminated,
but I don't really see a clean way to do it.

*Problem Statement*

There are several problems that need to be solved for interior pointers:

   1. For various reasons, we need to be able to find the containing *
   exterior* object from an interior object reference.
   2. We need a way to *identify* interior pointers during marking and
   relocation.
   3. Marking an interior object needs to have the effect of marking the
   exterior object.
   4. Interior object references may point into an exterior object that has
   been forwarded. This needs to be handled by the barrier implementation.

The last two issues are pretty easy to address once we solve the first two,
so I'll focus on the first two.
*
*
*Identifying Interior Object References*

Identifying interior object references in large object space is easy,
because they won't point at the start of a page. For the same reason, it's
easy to locate both the containing object base pointer and the object
header for such an object. Note that we probably need to identify LOS
object references distinctively in any case, because the object headers for
those objects are likely to be represented differently from those in
conventional object spaces.

Identifying interior object references in the stack is not really necessary*.
*First, *all* references into the stack are effectively interior pointers
(into the stack frame), but second, we do not forward or relocate such
objects [*], so we don't really need to be concerned about whether an
object reference into the stack names the beginning of an object. The hard
cases are interior references that point into the nursery or the general
heap.

 [*] There's a possibility that on-stack objects get forwarded if we
implement lazy continuation capture, but that's doable, and I'm going to
ignore it here.


So the real problem is to *identify* interior references into the nursery
or the general heap.

I considered identifying interior pointers by setting a low-order bit in
the pointer value, but I think that's difficult:

   - It forces us to conservatively mask off that low bit on an awful lot
   of references. Note that we can't really do this by duplicating code, nor
   would we want to.
   - Once the bit is masked off, we end up with a lot of un-identified
   interior pointers in GC roots. We *can* limit them to appear *only* in
   GC roots, and distinguish those with static typing, but it adds a bunch of
   stuff to keep track of, and a bunch of main-line code to do the masking.

I also considered an approach that would use two virtual mappings of the
heap - one for interior references and the other for exterior references.
That approach:

   - Eliminates the masking code
   - Preserves the identifiability of interior references at all points, so
   we don't need static typing magic to identify them.
   - Cuts the available virtual space in half (not an issue on 64 bit
   systems).
   - Makes object EQ slightly more expensive.

On 32-bit systems, at least, I think that's not a solution because of the
reduction in available virtual address space.

So that leaves some sort of marker bit in the interior object header, which
could either be a distinguished bit 'I' or the uppermost bit of the GC's
object size field (see below).

*Finding Exterior from Interior*

Given an interior reference into the nursery or general heap, it's not *that
* hard to find the object base. We walk backwards through the line markers
to find the nearest preceding non-continued line, and then walk forward
using the object sizes until we find the object containing the reference of
interest. Alternatively - and probably better - note that the GC runtime
does not use the object size field in an interior GC header; we can use
this field to hold a (negative) offset to the containing object, which
allows us to walk outward to the exterior object directly.

If we use the reverse-offset approach, then the high-order bit of the
object size does double duty as a bit that identifies interior objects.

Personally, I don't think it matters whether we use two's complement
encoding or reserve an 'I' bit. We're going to lose a bit either way. I
think the choice probably gets made based on the barrier implementations.
There may turn out to be a reason to have the 'I' bit conveniently adjacent
to something else for some reason.

*Other Thoughts*
*
*
*Incremental Logging*

As I was writing through this, it occurred to me that interior header words
solve another problem: incremental logging. When an object is very large,
we *really* don't want to be logging the whole object. If unboxed objects
on the interior carry 'M' (modified and deferred) bits, then we have a
straightforward way to log (hierarchically structured) sub-portions of
objects. That's actually very nice, because it means that the *write* barrier
is almost entirely unaffected by the presence of interior objects.

*Deferred Read Barriers*
*
*
It also occurred to me that it *might* be possible to use the 'M' bits to
reduce forwarding overhead dynamically. The core idea is that we don't need
to do read barriers until somebody actually modifies the object. If we can
assume the presence of read locks, then I think we can do something like
the following:

   1. An object cannot be forwarded with the M bit set. That is: we require
   any deferred RC updates to be flushed.
   2. An object whose 'M' bit is *clear* can be forwarded eagerly.
   Immediately after forwarding, *both* copies of the object are valid.
   3. The first time that an update is performed on *either* copy, we know
   that:
      1. The 'M' bit is clear prior to this write barrier
      2. The mutator thread performing this store holds an exclusive lock
      on this object

If we can somehow make the following additional assumptions:

   - Upgrading an object from readable to read-write (by running a write
   barrier) causes all *read* references to the same object in the same
   mutator's current call frame to become dead.
   - Upgrading a *forwarded* object from readable to read-write *also* causes
   a stack return barrier to be introduced.
   - If mutator A holds a write lock on an object, then no other mutator
   holds a live *read* reference on the same object

Then *we don't need a read barrier at all*, which means (pleasantly) that
we don't need dynamic translation.

*Summary*
*
*

   1. I think there is a workable way to use a thin representation for
   interior pointers in a relocating collector
   2. It could be that we don't need a read barrier after all.
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to