Re: [bitc-dev] ARC in LLVM

Jonathan S. Shapiro Thu, 17 Oct 2013 13:45:57 -0700

On Thu, Oct 17, 2013 at 11:31 AM, Ben Karel <[email protected]> wrote:

> On Thu, Oct 17, 2013 at 2:02 PM, Jonathan S. Shapiro <[email protected]> wrote:
>
>> Thanks, Ben.
>>
>> From the outside, looking from the GC-inclined perspective, it seems
>> unfortunate that LLVM doesn't have a distinguished register class in the IR
>> for object references as opposed to pointers. This would go a long way
>> toward solving the stack map problem. On the other hand, I haven't thought
>> hard enough about whether the presence of GCREAD/GCWRITE coupled with local
>> dataflow might be enough.
>>
>
> What stack map problem?
>

The stack map problem is to keep track of which words on the stack hold
pointers at any given time. It's a map taking as input a PC and an SP and
producing as output a set of frame-relative word offsets for the words that
contain object references.
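
A minimal C sketch of such a map, to make its shape concrete. Everything
here (StackMapEntry, stackmap_lookup, stackmap_roots) is a hypothetical
name for illustration, not an LLVM API, and it assumes one table entry per
safe point, sorted by PC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one entry per safe point (e.g. a call's return PC). */
typedef struct {
    uintptr_t      pc;         /* PC identifying the safe point */
    size_t         n_offsets;  /* number of frame words holding references */
    const int32_t *offsets;    /* word offsets relative to the frame's SP */
} StackMapEntry;

/* Binary search over a table sorted by pc; NULL if pc is not a safe point. */
static const StackMapEntry *stackmap_lookup(const StackMapEntry *table,
                                            size_t n, uintptr_t pc)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].pc == pc)
            return &table[mid];
        if (table[mid].pc < pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Write the addresses of the frame's reference slots into out (at most cap). */
static size_t stackmap_roots(const StackMapEntry *e, uintptr_t *sp,
                             uintptr_t **out, size_t cap)
{
    assert(e != NULL);
    size_t n = 0;
    for (size_t i = 0; i < e->n_offsets && n < cap; i++)
        out[n++] = sp + e->offsets[i];
    return n;
}
```

A collector would call stackmap_lookup once per frame while walking the
stack and treat each returned slot as a root.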

Last time I checked, LLVM provided *zero* assist for this. The solution
people were using was to adopt the threaded-stack approach originally
proposed by Fergus Henderson [*]. In essence, the front end gathers all of
a procedure's object references into a struct on the stack (one frame of a
shadow stack), and then arranges to "leak" the address of that struct
outside the procedure. The effect is that
any temporary register containing an object reference becomes non-live at
every procedure call and has to be re-loaded from the struct. This ensures
that there are no live object references in registers at any time when a
collection might occur.

   [*] Fergus Henderson, Accurate Garbage Collection in an Uncooperative
Environment
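
A rough C sketch of the kind of code a front end might emit under this
scheme. The names (ShadowFrame, shadow_top, gc_count_roots) are
hypothetical, and the "collector" here only counts roots; a real one would
mark or relocate through the same slots:

```c
#include <assert.h>
#include <stddef.h>

/* One shadow-stack frame per active procedure, linked through prev. */
typedef struct ShadowFrame {
    struct ShadowFrame *prev;
    size_t              n_roots;
    void              **roots;   /* this procedure's on-stack root slots */
} ShadowFrame;

/* Storing a frame's address here is the "leak": the compiler can no longer
   assume the slots are private, so it cannot cache them across calls. */
static ShadowFrame *shadow_top = NULL;

/* Stand-in for a collection: walk every frame and count non-NULL roots. */
static size_t gc_count_roots(void)
{
    size_t n = 0;
    for (ShadowFrame *f = shadow_top; f != NULL; f = f->prev)
        for (size_t i = 0; i < f->n_roots; i++)
            if (f->roots[i] != NULL)
                n++;
    return n;
}

static size_t callee(void *arg)
{
    void *slots[1] = { arg };
    ShadowFrame frame = { shadow_top, 1, slots };
    shadow_top = &frame;                 /* push */
    size_t seen = gc_count_roots();      /* a call that might collect */
    shadow_top = frame.prev;             /* pop */
    return seen;
}

static size_t caller(void *a, void *b)
{
    void *slots[2] = { a, b };
    ShadowFrame frame = { shadow_top, 2, slots };
    shadow_top = &frame;                 /* push */
    size_t seen = callee(slots[0]);
    /* After any call, references are re-read from the slots, because a
       moving collector may have rewritten them. */
    a = slots[0];
    b = slots[1];
    (void)a;
    (void)b;
    shadow_top = frame.prev;             /* pop */
    return seen;
}
```

The reloads after each call are the spill traffic whose cost is discussed
next.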

The overhead is just as high as that of the Boehm collector, which is well known to
have much higher overhead than GCs designed for safe languages (as it
should - Boehm is solving a *much* harder problem). The fact that the
Henderson overhead is this high is primarily due to the inability to
preserve object references in registers across procedure calls. The
importance of that is *astonishing*.

In a non-relocating collector, or in a collector that does not relocate
root-referenced objects, it *would* be possible to preserve object
referencing registers. In that model, the Henderson mechanism can be viewed
as a clever trick for identifying roots that can be used in a compiler that
is otherwise uncooperative. In effect, the front end is building the stack
map explicitly.

>
>
>> The "no objects in registers" rule, though, is *very* expensive. At
>> first glance, it would seem to restrict the compiler to the Henderson-style
>> approach. The overhead of those spills has been measured. It's pretty big.
>>
>
> I'm unclear on what you think "no objects in registers" (again, that
> word!) means, or what alternative you have in mind. Could you clarify? And
> also cite the study you have in mind?
>

If you don't know which registers contain object references, you have two
choices:

   1. Ensure that any object which *might* be referenced by a register
   (conservatively) is not relocated. This precludes mark-compact and
   generally makes relocation a pain.
   2. Ensure that no object references are allowed to remain in registers
   during a collection (this is what Henderson is doing).

So basically, those pesky registers really *are* out to get you. :-)

Though now that I think about it, the only objects we *really* need to
avoid moving are the ones that are actually referenced from registers, and
a tri-color technique can find those pretty easily. The first problem is
that it leaves fragmentation in new space, where you *really* want to be
able to play nice with the bump allocator. The second problem is that
registers may hold interior pointers, and even posterior pointers, that are
object references in disguise. This second problem is why the Henderson
technique ensures that all object referencing registers are non-live. That
also ensures that dependent interior temporaries die at procedure calls.
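
A small C sketch of why disguised references are awkward. It assumes a
hypothetical table of heap objects (HeapObject, find_object are invented
names) and reads "posterior pointer" as a pointer just past the end of an
object, which is an interpretation rather than something stated above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap description: each object occupies [base, base + size). */
typedef struct {
    uintptr_t base;
    size_t    size;
} HeapObject;

/* Return the index of the object containing addr, or -1 if none does. */
static int find_object(const HeapObject *objs, int n, uintptr_t addr)
{
    assert(objs != NULL || n == 0);
    for (int i = 0; i < n; i++)
        if (addr >= objs[i].base && addr - objs[i].base < objs[i].size)
            return i;
    return -1;
}
```

With two adjacent objects at 0x1000 (16 bytes) and 0x1010 (32 bytes), an
interior pointer such as 0x1008 resolves to the first object, but a
one-past-the-end pointer into the first object (0x1010) resolves to the
second. A register holding that value keeps the wrong object alive, or
pins the wrong one, unless the collector knows which base it was derived
from.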

>
>> There is a stage in LLVM where hard registers are selected. Do the
>> GCREAD/GCWRITE annotations survive to that point?
>>
>
> No, the intrinsics are lowered to LLVM source-level IR.
>

In that case you have absolutely no way to know which hard registers hold
object references, and no way to tell the optimizer that (e.g.) using an
interior pointer to index over an array is a bad idea. That seems
unfortunate.

> I agree in general about pointer-kind distinctions; the question is, should
> those distinctions be made **in LLVM IR** or in a pre-LLVM frontend IR? The
> natural inclination is to do it at the LLVM level, but I'd argue the
> latter makes for a cleaner overall design.
>

They cannot be done correctly in a pre-LLVM front-end IR. If the mid-end IR
doesn't preserve the register classifications, then it is impossible for
the back end to know which hard registers hold object references, and *that* is
what we wanted to know here. It's also impossible for the mid-end optimizer
to know when optimizations involving interior pointers should be avoided.

So I think it's fundamental. What's needed from an IR perspective is:

   1. LDREF / STREF instructions
   2. A new register class for object references. This is conceptually
   similar to the existing register classes that distinguish floating point
   and integer values.

Then you go through and make sure that all of the interior-pointer
optimizations are properly sensitized to register class. The easiest way to
ensure that is to make sure that the IR doesn't include an add operation
that takes an IR register in the object-reference register class.

Then at the end you tweak the hard register allocator to keep track of
what's what.

shap

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
