Re: JIT pluggability (GC Interface)

2004-01-12 Thread Patrik Reali
Hi!

let me add my 0.02 cents (of euro). :-)

GC Interface:

let's divide the GCs in 2x2 categories:

* (TA) type-accurate: know where the pointers inside the objects are
* (CO) conservative: guess where the pointers inside the objects are
and

* (NC) non-concurrent: the GC is the only thread running during the 
collection
* (CI) concurrent / incremental: the GC and the mutator run 
(pseudo-)concurrently

These categories affect the interface between compiler, generated code, and 
the gc itself.

Static information (TA vs. CO):
* All GCs need a hook into the memory management to return the blocks to 
the free-list.
* All GCs need to access the whole heap
* TA GCs need a root-set from which the collection starts: the list of all 
static fields, all pointers on the stack of each thread, all pointers in 
the CPU registers (and those in the stack when the thread is preemted) [The 
registers depend in fact on the compiler optimizations: if some value may 
live longer in a register than in a variable then yes, otherwise if the 
compiler ensures that all pointers in the registers are also anchored 
somewhere in a variable, then no]
The list of static fields can be obtained through reflection; pointers on 
the stack are more complicated to get.
* TA GCs need a type descriptor containing the pointer offsets inside each 
type; arrays of references are also to be collected

Code Instrumentation (NC vs. CI):
* NC GCs do not need code instrumentation
* some CI GCs define code points where the GC can run
* some CI use read barriers (each pointer read operation is instrumented)
* some CI use write barriers (each pointer store operation is instrumented)
* some GCs use master-lists (each pointer points to the real pointer which 
is present only once, thus moving an object correspond to moving a 
master-pointer instead of each reference)

Example: Jaos + Aos Kernel
The Aos kernel provides memory allocation with a mark & sweep GC; the mark 
operation is implemented using the pointer rotation algorithm (Deutsch 
Waite Schorre). This is a "stop the world" non-concurrent GC (even in the 
multiprocessor version of the kernel).
The GC information is provided by the Jaos classloader (because it performs 
the field allocation during the load); each class has a module (a runtime 
compilation-unit) with the code, statics and a list of static pointers, and 
a type descriptor containing the pointer offsets, the method table, and the 
type hierarchy (those are then patched by the linker). The stack traversal 
is conservative.

A few pointers:
* http://www.memorymanagement.org
* Paul Wilson; Uniprocessor Garbage Collection Techniques; Proceedings of 
the Memory Management, International Workshop, Saint Malo, France, Sep 1992 
(postscript 764KB; ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps)
* Garbage Collection in Oberon: http://www.oberon.ethz.ch/gc/
* Garbage Collection in .NET: (part 1) 
 and (part 2) 

* Slides from lecture "System Software" on Garbage Collection: 
http://www.cs.inf.ethz.ch/37-201/slides/ssw03_memorymanagement_2.pdf

I hope this clarifies more than it confuses

-Patrik

--On Samstag, 10. Januar 2004 11:29 +0100 Chris Gray <[EMAIL PROTECTED]> 
wrote:

On Friday 09 January 2004 09:45, Sascha Brawer wrote:
Tom Tromey <[EMAIL PROTECTED]> wrote on Thu, 8 Jan 2004 17:48:59 -0700:
> [standard pluggable JIT interface]
This would indeed be quite nice, IMHO.
IMHO2 (that should probably be 3 or 4 by now).

I suspect that the interface to GC will be hard to define, though. There
are  several possible models, including:
  1. The mutator (JITted code) tells GC, "hey! I just mutated
something!",  whereupon GC either (a) aborts the current attempt to mark
the heap or (b)  makes a note to re-mark the affected objects before
doing a sweep.   2. No one is allowed to mutate anything during the
marking process, so a  protocol is devised which ensures this without
deadlock.
And that's only counting mark-and-sweep collectors ...

> Language choice for API.
>  The obvious choices being:
>  C lowest common denominator
>  C++   next-to-lowest common denominator :-)  provides some
>abstraction benefit, maybe
>  Java  using our own tools...
There are some users of Classpath who cannot execute any C code, such as
Jaos and JNodeOS. If the pluggable JIT interface was in Java, these
systems would be able to use it (assuming that someone writes a JITter in
Java, like IBM did with their JalapeƱo/Jikes RVM).
Also, I'd assume that the interface would be rather narrow in the sense
that it wouldn't get invoked very often in the direction VM --> JIT;
probably once for each class or method to be compiled. But the JIT would
presumably have to call a lot into the VM (for retrieving field offsets
etc.).
> ABI Issues

According to IBM's publications and web pages, Jikes RVM seems to nicely
cope with 

Re: JIT pluggability (GC Interface)

2004-01-12 Thread Chris Gray
On Monday 12 January 2004 21:11, Patrik Reali wrote:
> Hi!
>
> let me add my 0.02 cents (of euro). :-)

Yup, it's "soldi" time. :)

> GC Interface:
>
> let's divide the GCs in 2x2 categories:
>
> * (TA) type-accurate: know where the pointers inside the objects are
> * (CO) conservative: guess where the pointers inside the objects are
>
> and
>
> * (NC) non-concurrent: the GC is the only thread running during the
> collection
> * (CI) concurrent / incremental: the GC and the mutator run
> (pseudo-)concurrently

CI corresponds to my type 1, NC to type 2.

> These categories affect the interface between compiler, generated code, and
> the gc itself.

Is anybody using CO?

> Static information (TA vs. CO):
> * All GCs need a hook into the memory management to return the blocks to
> the free-list.
> * All GCs need to access the whole heap
> * TA GCs need a root-set from which the collection starts: the list of all
> static fields, all pointers on the stack of each thread, all pointers in
> the CPU registers (and those in the stack when the thread is preemted) [The
> registers depend in fact on the compiler optimizations: if some value may
> live longer in a register than in a variable then yes, otherwise if the
> compiler ensures that all pointers in the registers are also anchored
> somewhere in a variable, then no]
> The list of static fields can be obtained through reflection; pointers on
> the stack are more complicated to get.
> * TA GCs need a type descriptor containing the pointer offsets inside each
> type; arrays of references are also to be collected
>
> Code Instrumentation (NC vs. CI):
> * NC GCs do not need code instrumentation

Not so fast.

A NC GC typically needs to bring all (non-GC) threads to a "safe" state 
before it can run: a "safe" state is one in which all references from a 
thread to the heap are visible to GC. So some degree of instrumentation is 
needed in order to determine when a thread is "safe", and maybe also a 
mechanism to politely ask a thread to become "safe" at the earliest 
opportunity.
 
> * some CI GCs define code points where the GC can run
> * some CI use read barriers (each pointer read operation is instrumented)
> * some CI use write barriers (each pointer store operation is instrumented)
> * some GCs use master-lists (each pointer points to the real pointer which
> is present only once, thus moving an object correspond to moving a
> master-pointer instead of each reference)

AKA "handles".

>
> Example: Jaos + Aos Kernel
> The Aos kernel provides memory allocation with a mark & sweep GC; the mark
> operation is implemented using the pointer rotation algorithm (Deutsch
> Waite Schorre). This is a "stop the world" non-concurrent GC (even in the
> multiprocessor version of the kernel).
> The GC information is provided by the Jaos classloader (because it performs
> the field allocation during the load); each class has a module (a runtime
> compilation-unit) with the code, statics and a list of static pointers, and
> a type descriptor containing the pointer offsets, the method table, and the
> type hierarchy (those are then patched by the linker). The stack traversal
> is conservative.
>
> A few pointers:
> * http://www.memorymanagement.org
> * Paul Wilson; Uniprocessor Garbage Collection Techniques; Proceedings of
> the Memory Management, International Workshop, Saint Malo, France, Sep 1992
> (postscript 764KB; ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps)
> * Garbage Collection in Oberon: http://www.oberon.ethz.ch/gc/
> * Garbage Collection in .NET: (part 1)
>  and (part 2)
>  2/GCI2.asp>
> * Slides from lecture "System Software" on Garbage Collection:
> http://www.cs.inf.ethz.ch/37-201/slides/ssw03_memorymanagement_2.pdf

And if you don't mind killing trees: Richard Jones, Rafael Lins; Garbage 
Collection; John Wiley and Sons, 1996 (ISBN 0 471 94148 4).

> I hope this clarifies more than it confuses
>
> -Patrik
>
>
> --On Samstag, 10. Januar 2004 11:29 +0100 Chris Gray <[EMAIL PROTECTED]>
>
> wrote:
> > On Friday 09 January 2004 09:45, Sascha Brawer wrote:
> >> Tom Tromey <[EMAIL PROTECTED]> wrote on Thu, 8 Jan 2004 17:48:59 -0700:
> >> > [standard pluggable JIT interface]
> >>
> >> This would indeed be quite nice, IMHO.
> >
> > IMHO2 (that should probably be 3 or 4 by now).
> >
> > I suspect that the interface to GC will be hard to define, though. There
> > are  several possible models, including:
> >   1. The mutator (JITted code) tells GC, "hey! I just mutated
> > something!",  whereupon GC either (a) aborts the current attempt to mark
> > the heap or (b)  makes a note to re-mark the affected objects before
> > doing a sweep.   2. No one is allowed to mutate anything during the
> > marking process, so a  protocol is devised which ensures this without
> > deadlock.
> >
> > And that's only counting mark-and-sweep collectors ...