On Thu, Oct 17, 2013 at 11:31 AM, Ben Karel <[email protected]> wrote:
> On Thu, Oct 17, 2013 at 2:02 PM, Jonathan S. Shapiro <[email protected]>wrote: > >> Thanks, Ben. >> >> From the outside, looking from the GC-inclined perspective, it seems >> unfortunate that LLVM doesn't have a distinguished register class in the IR >> for object references as opposed to pointers. This would go a long way >> toward solving the stack map problem. On the other hand, I haven't thought >> hard enough about whether the presence of GCREAD/GCWRITE coupled with local >> dataflow might be enough. >> > > What stack map problem? > The stack map problem is to keep track of which words on the stack hold pointers at any given time. It's a map taking as input a PC and an SP and producing as output a set of frame-relative word offsets for the words that contain object references. Last time I checked, LLVM provided *zero* assist for this. The solution people were using was to adopt the threaded-stack approach originally proposed by Fergus Henderson [*]. In essence, the front end gathers all of the object references into an on-stack shadow stack, and then arranges to "leak" the address of the struct outside the procedure. The effect is that any temporary register containing an object reference becomes non-live at every procedure call and has to be re-loaded from the struct. This ensures that there are no live object references in registers at any time when a collection might occur. [*] Fergus Henderson, Accurate Garbage Collection in an Uncooperative Environment The overhead is just as high as the Boehm collector, which is well known to have much higher overhead than GCs designed for safe languages (as it should - Boehm is solving a *much* harder problem). The fact that the Henderson overhead is this high is primarily due to the inability to preserve object references in registers across procedure calls. The importance of that is *astonishing*. In a non-relocating collector, or in a collector that does not relocate root-referenced objects, it *would* be possible to preserve object referencing registers. In that model, the Henderson mechanism can be viewed as a clever trick for identifying roots that can be used in a compiler that is otherwise uncooperative. In effect, the front end is building the stack map explicitly. > > >> The "no objects in registers" rule, though, is *very* expensive. At >> first glance, it would seem to restrict the compiler to the Henderson-style >> approach. The overhead of those spills has been measured. It's pretty big. >> > > I'm unclear on what you think "no objects in registers" (again, that > word!) means, or what alternative you have in mind. Could you clarify? And > also cite the study you have in mind? > If you don't know which registers contain object references, you have two choices: 1. Ensure that any object which *might* be referenced by a register (conservatively) is not relocated. This precludes mark-compact and generally makes relocation a pain. 2. Ensure that no object references are allowed to remain in registers during a collection (this is what Henderson is doing). So basically, those pesky registers really *are* out to get you. :-) Though now that I think about it, the only objects we *really* need to avoid moving are the ones that are actually referenced from registers, and a tri-color technique can find those pretty easily. The first problem is that it leaves fragmentation in new space, where you *really* want to be able to play nice with the bump allocator. The second problem is that registers may hold interior pointers, and even posterior pointers, that are object references in disguise. This second problem is why the Henderson technique ensures that all object referencing registers are non-live. That also ensures that dependent interior temporaries die at procedure calls. > >> There is a stage in LLVM where hard registers are selected. Do the >> GCREAD/GCWRITE annotations survive to that point? >> > > No, the intrinsics are lowered to LLVM source-level IR. > In that case you have absolutely no way to know which hard registers hold object references, and no way to tell the optimizer that (e.g.) using an interior pointer to index over an array is a bad idea. That seems unfortunate. I agree in general about pointer-kind distinctions; the question is, should > those distinctions be made **in LLVM IR** or in a pre-LLVM frontend IR? The > natural inclination is to do it at the LLVM level, but I'd argue the > latter makes for a cleaner overall design. > They cannot be done correctly in a pre-LLVM front-end IR. If the mid-end IR doesn't preserve the register classifications, then it is impossible for the back end to know which hard registers hold object references, and *that* is what we wanted to know here. It's also impossible for the mid-end optimizer to know when optimizations involving interior pointers should be avoided. So I think it's fundamental. What's needed from an IR perspective is: 1. LDREF / STREF instructions 2. A new register class for object references. This is conceptually similar to the existing register classes that distinguish floating point and integer values. Then you go through and make sure that all of the inner pointer optimizations are properly sensitized to register class. The easiest way to ensure that is to make sure that the IR doesn't include an add operation that takes an IR register in the object-reference register class. Then at the end you tweak the hard register allocator to keep track of what's what. shap
_______________________________________________ bitc-dev mailing list [email protected] http://www.coyotos.org/mailman/listinfo/bitc-dev
