I like it.  I don't fully understand the fine details yet.  But overall it
seems to be a clean design.  Maybe it makes sense for someone to prototype
this in drlvm.

On 10/30/06, Etienne Gagnon <[EMAIL PROTECTED]> wrote:

Hi all,

Here's a more structured proposal for a simple and effective
implementation of class unloading support.

In accordance with Section 2.17.8 of the JVM spec, class unloading (and
its related native resource cleanup) can only happen when the class
loader instance becomes unreachable.  For this to happen, we put in
place the following things:

1- Each class loader is represented by some VM internal structure.
[We'll call it the "class loader structure"].

2- Each class loader internal structure, except (optionally) the
bootstrap class loader, maintains a weak reference to an object
instance of class ClassLoader (or some subclass).  [We'll call this
object the "class loader instance"].  The Java instance has some opaque
pointer back to the internal VM structure.  It is usually created
before the internal VM structure, and its constructor is usually in
charge of creating that structure.

3- Each class loader instance maintains a collection of loaded classes.
A class/interface is never removed from this collection.  This
collection maintains "hard" (i.e. "not weak") references to
classes/interfaces.

4- [Informative] A class loader instance is also most likely to maintain
a collection of classes for which it has "initiated" class loading.
This collection should use hard references (as weak references won't
lead to earlier class unloading).

5- Each class loader instance maintains a hard reference to its parent
class loader.  This reference is (optionally) null if the parent is the
bootstrap class loader.

6- Each j.l.Class instance maintains a hard reference to the class
loader instance of the class loader that has loaded it.  [This is not
the "initiating" loaders, but really the "loading" loader].

7- Each class loader structure maintains a set of boolean flags, one
flag per "non-nursery" garbage collected area (even when thread-local
heaps are used).  The flag is set when an instance of a class loaded by
this class loader is moved into the related GC-area.  The flag is unset
when the GC-area is emptied, or (optionally) when it can be determined
that no instance of a class loaded by this class loader remains in the
GC-area.  This is best implemented as follows: a) use an unconditional
write of "true" in the flag every time an object is moved into the
GC-area by the garbage collector, b) unset the related flag in "all"
class loader structures just before collecting a GC-area, then set
the flag back when an object survives in the area.

8- Each method invocation frame maintains a hard reference to either its
surrounding instance (in case of instance methods, i.e. invokevirtual,
invokeinterface, and invokespecial) or its surrounding class
(invokestatic).  This is already required for synchronized methods
(it's not a good idea to allow the instance to be collected before the
end of a synchronized instance method call; yep, learned the hard way
in SableVM...)  So, the "overhead" is quite minimal.  The importance of
this is in the correctness of not letting a class loader die while a
static/instance method of a class loaded by it is still active, which
would lead to premature release of native resources (such as jitted
code, etc.).

9- A little magic is required to prevent premature collection of a class
loader instance and its loaded j.l.Class instances (see [3-] above), as
object instances do not maintain a hard reference to their j.l.Class
instance, yet we want to preserve the correctness of Object.getClass().

So, the simplest approach is to maintain a hard reference in a class
loader structure to its class loader instance (in addition to the weak
reference in [2-] above).  This reference is kept always set (thus
preventing collection of the class loader instance), except when *all*
the following conditions are met:
a) All nurseries are empty.
b) All GC-area flags are unset.
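
To make the bookkeeping of [1-], [2-], [7-] and [9-] concrete, here is a
minimal sketch in Python.  It is only a model, not VM code: the class
names, the field names and the nurseries_empty parameter are all
hypothetical, and Python's weakref merely stands in for the VM-level
weak reference.

```python
import weakref


class ClassLoaderInstance:
    """Stand-in for a java.lang.ClassLoader object living in the Java heap."""

    def __init__(self, name):
        self.name = name
        self.loaded_classes = []  # hard references, never removed (item 3)


class ClassLoaderStructure:
    """Stand-in for the VM-internal structure of items 1, 2, 7 and 9."""

    def __init__(self, instance, num_areas):
        self.weak_instance = weakref.ref(instance)  # item 2
        self.hard_instance = instance               # item 9, normally set
        self.area_flags = [False] * num_areas       # item 7, one per GC-area

    def note_object_moved(self, area):
        # Item 7a: unconditional write, the old value is never read.
        self.area_flags[area] = True

    def may_drop_hard_reference(self, nurseries_empty):
        # Conditions a) and b): all nurseries empty and no flag set.
        return nurseries_empty and not any(self.area_flags)


loader = ClassLoaderInstance("app")
structure = ClassLoaderStructure(loader, num_areas=3)
structure.note_object_moved(1)
print(structure.may_drop_hard_reference(nurseries_empty=True))  # False
structure.area_flags[1] = False
print(structure.may_drop_hard_reference(nurseries_empty=True))  # True
```

The unconditional write keeps the collector's fast path free of
branches; the price is that flags can only be cleared in bulk, just
before the related GC-area is collected.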

Actually, for making this practical and preserving correctness, it's a
little trickier.  It requires a 2-step process, much like the
object-finalization dance.  Here's a typical example:

On a major collection, where all nurseries are collected, and some (but
not necessarily all) other GC-areas are collected, we do the following
sequence of actions:
a) All class loader structures are visited.  All flags related to
  non-nursery GC-areas that we intend to collect are unset.  If this
  leaves *all* flags unset, the hard reference to the class
  loader instance is set to NULL (thus enabling, possibly, the
  collection of the class loader instance).

b) The garbage collection cycle is started and proceeds as usual.
  Note that the work mandated in [7-] above is also done, which might
  lead to setting back some flags in class loader structures that had
  all their flags unset in [a)].

c) After the initial garbage collection is applied, and just before
  the usual treatment of weak references (where they are set to NULL
  when pointing to a collected object), all class loader structures
  are visited again.  The hard pointer of every class loader structure
  that has any flag set is set back to point to the class loader
  instance if it was NULL (same as how object instances are preserved
  for finalization).

d) If [c)] has triggered any change (i.e. it mandates the survival of
  additional class loader instances that were due to die), the garbage
  collection cycle is "extended" to rescue the additional class loader
  instances and all objects they can reach.

e) Any additional work of the garbage collection cycle is done (e.g.
  soft, weak, and phantom references, finalization handling).

f) All class loader structures are visited again.  Every structure for
  which the weak reference has NOT been set to NULL has its hard
  reference set to the weak reference target.  Every structure for
  which the weak reference has been set to NULL is now ready to be
  unloaded (i.e. release all of its native resources, including jitted
  code, class information, method information, vtables, and so on).
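
The sequence [a)] to [f)] can be tried out on a toy heap.  The sketch
below is a Python model under loud assumptions: Node, LoaderStructure,
trace and major_collection are hypothetical names, the "weak" field is
a plain field that the toy collector clears itself, and the whole heap
is traced from the roots.  It also repeats [c)] and [d)] until no
further flag comes back, which is an addition of this sketch for the
case where the extended trace sets new flags; the text above describes
a single pass.

```python
class Node:
    """Toy heap object; `refs` holds hard references to other nodes."""

    def __init__(self, name, area=None, loader=None):
        self.name = name
        self.area = area      # non-nursery GC-area index, None if untracked
        self.loader = loader  # LoaderStructure of this object's class
        self.refs = []


class LoaderStructure:
    def __init__(self, instance, num_areas):
        self.weak = instance  # modelled weak reference, cleared in step e)
        self.hard = instance  # hard reference of item 9
        self.flags = [False] * num_areas


def trace(starts, live, collected_areas):
    """Mark everything reachable from `starts`; re-set survivors' flags."""
    stack = list(starts)
    while stack:
        node = stack.pop()
        if node in live:
            continue
        live.add(node)
        if node.loader is not None and node.area in collected_areas:
            node.loader.flags[node.area] = True
        stack.extend(node.refs)


def major_collection(structures, roots, collected_areas):
    # a) Unset flags of the areas being collected; drop idle hard references.
    for s in structures:
        for area in collected_areas:
            s.flags[area] = False
        if not any(s.flags):
            s.hard = None
    # b) Ordinary trace from the roots and the remaining hard references.
    live = set()
    kept = [s.hard for s in structures if s.hard is not None]
    trace(list(roots) + kept, live, collected_areas)
    # c) and d) Restore hard references where a flag came back, then extend.
    while True:
        rescued = [s for s in structures
                   if s.hard is None and s.weak is not None and any(s.flags)]
        if not rescued:
            break
        for s in rescued:
            s.hard = s.weak
        trace([s.hard for s in rescued], live, collected_areas)
    # e) Weak reference processing: clear references to dead instances.
    for s in structures:
        if s.weak not in live:
            s.weak = None
    # f) Re-arm survivors; the other structures are ready to be unloaded.
    for s in structures:
        if s.weak is not None:
            s.hard = s.weak
    return [s for s in structures if s.weak is None]


app_loader = Node("app loader instance")
tmp_loader = Node("tmp loader instance")
app = LoaderStructure(app_loader, num_areas=2)
tmp = LoaderStructure(tmp_loader, num_areas=2)
survivor = Node("live object", area=0, loader=app)
garbage = Node("dead object", area=0, loader=tmp)
app.flags[0] = tmp.flags[0] = True

unloadable = major_collection([app, tmp], [survivor], {0})
print(unloadable == [tmp], app.hard is app_loader)  # True True
```

In this run the only thing keeping the "app" loader alive is a live
object that holds no reference to its class: the flag set in [b)] is
what rescues the loader in [c)], while "tmp" ends up unloadable.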


In addition, I highly recommend using the approach proposed in Chapter 3
of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
class-loader related memory.  It has many advantages:

1- No "header space" overhead for very small allocations.  [This is a
typical "hidden" space overhead of malloc() implementations to allow
for later free() calls].
2- Minimal memory fragmentation.  [Allocation only happens in large
  blocks].
3- Simple and very efficient allocation.  [No overhead for complex
  management of freeing small areas later].
4- Efficient freeing of large memory blocks on class unloading.
5- Possibility of clever usage of this memory; see Chapter 4 of the same
  document for the implementation of sparse interface virtual tables
  enabling invokeinterface at the simple cost of invokevirtual.  :-)
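
As a rough illustration of points 1- to 4-, here is a small Python
model of a per-class-loader allocator that hands out memory by bumping
an offset inside large blocks and frees everything at once.  It is a
sketch of the general idea only and does not claim to reproduce the
design in the thesis; LoaderArena, its methods and the default block
size are hypothetical.

```python
class LoaderArena:
    """Bump-pointer allocator owned by one class loader structure."""

    def __init__(self, block_size=64 * 1024):
        self.block_size = block_size
        self.blocks = []  # large blocks obtained from the system
        self.offset = 0   # next free byte in the last block

    def allocate(self, size, align=8):
        """Return (block_index, offset) for `size` bytes; no header is kept."""
        start = (self.offset + align - 1) // align * align
        if not self.blocks or start + size > len(self.blocks[-1]):
            # The current block is full: take a new large block.
            self.blocks.append(bytearray(max(self.block_size, size)))
            start = 0
        self.offset = start + size
        return len(self.blocks) - 1, start

    def release_all(self):
        """Class unloading: drop every block at once, return bytes freed."""
        freed = sum(len(block) for block in self.blocks)
        self.blocks.clear()
        self.offset = 0
        return freed


arena = LoaderArena(block_size=1024)
print(arena.allocate(24))   # (0, 0)
print(arena.allocate(10))   # (0, 24)
print(arena.allocate(16))   # (0, 40)
print(arena.release_all())  # 1024
```

Because nothing is ever freed individually, the allocator needs no
per-allocation header and no free list; the only bookkeeping is the
list of blocks that is released when the class loader is unloaded.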


I hope this is useful to both projects [drlvm][sablevm]  :-)

Etienne

(C) 2006 by Etienne M. Gagnon <[EMAIL PROTECTED]>
This text is licensed under the Apache License, Version 2.0.

[You may add this document in svn;  I am willing to sign the required
Apache agreement to make it so, if you intend to use it in drlvm's
implementation].

--
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/





--
Weldon Washburn
Intel Enterprise Solutions Software Division
