Re: [drlvm] Class unloading support - tested one approach

Etienne Gagnon Wed, 08 Nov 2006 21:22:45 -0800

[First, let me say that, as I am not contributing a class unloading
*implementation* to drlvm, I will understand if the project is more
inclined to choose an actually contributed piece of code over a design
without a contributed implementation. :-)]

There was a "-1" vote...  Hmmm...  As I voted "+1", I will enter the
arena... :-)

Before getting further into this class unloading discussion, I would like
to get some clarifications about 3 things:

1- How does drlvm implement object finalization?  Doesn't this require
some "object" rescuing, much like my proposal for correctly
implementing class unloading (the class loader rescuing thing)?

 If it does, then can somebody explain to me what's wrong with
 my proposal of setting, in normal operation, a "hard" reference to
 class loader objects, and then temporarily using a weak, rescuable
 reference to enable class loader collection?  I don't see a performance
 hog there.  Rescuing a few class loaders (if any) and their related
 classes once per epoch shouldn't cost much!  I have yet to see a
 convincing explanation of how continuous collection of "object-vtables"
 would be more efficient...

 Really, even with Robin's proposal, this would work.  If a class loader
 gets into an unloadable state, then most likely, the class loader and
 all classes it has loaded will have migrated to an old generation.  So,
 as long as we set the end of a class unloading epoch at an old
 generation collection, then we can simply "weaken" the class loader
 reference during collection (only when the bits of all related vtables
 are unset), then apply the finalization-like rescue dance to class
 loaders.

 [*]This wouldn't affect any other operation, during other GC cycles, as
 Robin's unconditional bit/byte/word vtable write only serves to tell us
 whether a class loader has had living instances of its classes during
 the epoch.
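
 The epoch scheme in 1- can be sketched roughly as follows.  This is
 only an illustration under stated assumptions, not drlvm code: the
 names LoaderRecord, VTable, visitObject and endEpoch are hypothetical,
 and the "reachedByTrace" predicate stands in for whatever the
 epoch-ending old generation collection discovers once the class loader
 reference has been weakened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical per-class structure; the flag is the bit/byte/word written by the GC. */
final class VTable {
    boolean liveInEpoch;
}

/** Hypothetical VM-side record for one class loader. */
final class LoaderRecord {
    final String name;
    final List<VTable> vtables = new ArrayList<>();
    boolean strongRoot = true; // "hard" reference held by the VM in normal operation

    LoaderRecord(String name) { this.name = name; }
}

final class EpochUnloadingSketch {
    /** Called for every object the collector marks or copies: an unconditional write. */
    static void visitObject(VTable vt) {
        vt.liveInEpoch = true;
    }

    /** True if any class of this loader had a live instance during the epoch. */
    static boolean hadLiveInstances(LoaderRecord cl) {
        for (VTable vt : cl.vtables) {
            if (vt.liveInEpoch) return true;
        }
        return false;
    }

    /**
     * End of an epoch (assumed to coincide with an old generation collection).
     * Loaders whose vtable flags are all unset are weakened; those still reached
     * by the trace are rescued, the others are returned as unloadable.
     */
    static List<LoaderRecord> endEpoch(List<LoaderRecord> loaders,
                                       Predicate<LoaderRecord> reachedByTrace) {
        List<LoaderRecord> unloadable = new ArrayList<>();
        for (LoaderRecord cl : loaders) {
            if (!hadLiveInstances(cl)) {
                cl.strongRoot = false;          // weaken for this collection only
                if (reachedByTrace.test(cl)) {
                    cl.strongRoot = true;       // rescued: back to a hard reference
                } else {
                    unloadable.add(cl);
                }
            }
            for (VTable vt : cl.vtables) {
                vt.liveInEpoch = false;         // reset flags for the next epoch
            }
        }
        loaders.removeAll(unloadable);
        return unloadable;
    }
}
```

 Note that visitObject() does the same thing in every GC cycle; only
 endEpoch() contains unloading-specific logic, and it runs once per
 epoch.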

2- I would like to read some "full" description of the object-vtable
proposal.  In particular, how does this proposal deal with the following:
 a) Preventing unloading of a class which has an active static method.
 b) Preventing unloading of a class which has an unsynchronized instance
method active, which has overwritten local variable 0 (or for which the
liveness analysis has detected the death of the reference to "this").
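
To make cases a) and b) concrete, here is a minimal sketch of why
tracking live instances alone is not enough.  It assumes a hypothetical
Frame summary produced by a stack scanner (the names Frame and
ActiveMethodRoots are made up for illustration) and compares two
policies: pinning classes through live "this" references only, and
pinning the declaring class of every active method.  It says nothing
about how any of the competing proposals actually handles this.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stack-frame summary, as a stack scanner might see it. */
final class Frame {
    final String declaringClass; // class that defines the executing method
    final String liveThisClass;  // class of the object in local 0, or null if
                                 // the method is static or "this" is dead

    Frame(String declaringClass, String liveThisClass) {
        this.declaringClass = declaringClass;
        this.liveThisClass = liveThisClass;
    }
}

final class ActiveMethodRoots {
    /** Classes kept alive only through live "this" references on the stack. */
    static Set<String> pinnedByLiveThis(List<Frame> stack) {
        Set<String> pinned = new LinkedHashSet<>();
        for (Frame f : stack) {
            if (f.liveThisClass != null) pinned.add(f.liveThisClass);
        }
        return pinned;
    }

    /** Classes kept alive because one of their methods has an active frame. */
    static Set<String> pinnedByActiveMethod(List<Frame> stack) {
        Set<String> pinned = new LinkedHashSet<>();
        for (Frame f : stack) {
            pinned.add(f.declaringClass);
        }
        return pinned;
    }
}
```

With a stack holding a static method of class A (case a), an instance
method of class B whose "this" is dead (case b), and an ordinary
instance method of class C, the first policy pins only C, while the
second pins A, B and C.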

3- Why would it be so hard to add an unconditional write operation
during collection (e.g. during copying or marking of an object) in
drlvm?  A detailed technical explanation is welcome. :-)

  So far, this last point (3-) seems to be the sole argument in favor of
  using the "object-vtables" approach.  Wouldn't the right fix, if
  it's currently really impossible to implement an unconditional
  write, be to extend the drlvm GC interface?  Isn't this a design
  deficiency in the GC interface?  No other argument, so far, seems
  to be in favor of the object vtable approach, unless I missed some.

As for Robin's attempt to deal with weak/hard references to the class
loader object using a Reference Queue, I am not yet convinced of the
correctness of the approach...  [off the top of my head: potential
problem with static synchronized methods].  And, this would probably be
more work-intensive (so, less efficient) than my proposal in 1-[*]
above.  It might also be tricky to identify all possible situations, such
as Object.getClass(), where some special code is needed to deal with a
problem situation.  I prefer clean, scope-limited code for dealing with
class unloading.  It's easier, that way, to activate or deactivate class
unloading [dynamically!].

In summary, I would like to be convinced of the "completeness" and of
the "correctness" of all competing approaches.  I personally am, so far,
in favor of Robin's unconditional vtable bit/byte/word write idea, along
with an "adapted" version of my proposal for dealing with class loader
death (such as proposed in 1-[*] above).

Also, if somebody is able to find a "correctness" deficiency in my
proposal, then please let us know, so that we make sure this deficiency
is eliminated from all competing proposals.

By the way, what are the currently competing proposals?
1- object vtables
2- Robin/Gagnon proposal  (still finishing up some details ;-)
3- Is there a 3rd?

Which ones have existing implementations?  How "correct/complete" are
they?  Do we have access to some "human readable" (i.e. non-code) full
description of the algorithm?

Etienne

Robin wrote:
> On Thu, 2006-11-09 at 02:01 +0300, Ivan Volosyuk wrote:
> 
>>Robin,
>>
>>thank you for the detailed description of the algorithm. IMHO, this was
>>the most complicated part of the whole story: how to have a weak
>>reference to a classloader and still be able to bring it back to life.
>>This shouldn't be a performance-critical part and is quite doable. I
>>absolutely agree with your estimates about tracing an extra reference
>>per object. The approach you propose is more efficient and quite
>>elegant.
>>--
>>Ivan
> 
> 
> Thanks :)
> 
> 
>>On 11/8/06, Robin Garner <[EMAIL PROTECTED]> wrote:
>>
>>>Robin Garner wrote:
>>>
>>>>Aleksey Ignatenko wrote:
>>>>
>>>>>Robin.
>>>>>
>>>>>
>>>>>>OK, well how about keeping a weak reference to the j.l.ClassLoader
>>>>>>object instead of a strong one.  When the reference becomes (strong)ly
>>>>>>unreachable, invoke the class-unloading phase.
>>>>>
>>>>>
>>>>>If you have a weak reference to j.l.Classloader - GC will collect it
>>>>>(with all appropriate jlClasses) as soon as there are no references to
>>>>>j.l.Classloader and appropriate classes. But there is a possible
>>>>>situation when there are some live objects of those classes and no
>>>>>references to jlClassloader and jlClasses. This will lead to
>>>>>unpredictable consequences (crash, etc).
>>>>>
>>>>>
>>>>>
>>>>>I want to remind you that there are 3 mandatory conditions for class unloading:
>>>>>
>>>>>1. j.l.Classloader instance is unreachable.
>>>>>
>>>>>2. Appropriate j.l.Class instances are unreachable.
>>>>>
>>>>>3. No object of any class loaded by appropriate class loader exists.
>>>>
>>>>Let me repeat.  I offer an efficient solution to (3).  I don't purport
>>>>to have a solution to (1) and (2).
>>>
>>>Let me just add:  This is because I don't think (1) or (2) are
>>>particularly difficult from a performance point of view, although I'm
>>>happy to accept that there may still be some subtle engineering challenges.
>>>
>>>Now this is just off the top of my head, but what about this for a design:
>>>- A j.l.ClassLoader maintains a collection of each of the classes it has
>>>loaded
>>>- A j.l.Class contains a pointer to its j.l.ClassLoader
>>>- A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
>>>The point of this is that a class loader and its classes are a 'self
>>>sustaining' data structure - if one element in it is reachable the whole
>>>thing is reachable.
>>>
>>>The VM maintains a weak reference to all its j.l.ClassLoader instances,
>>>and maintains a ReferenceQueue for weakly-reachable classloaders.
>>>ClassLoaders are placed on the ReferenceQueue if and only if they are
>>>unreachable from the heap (including via their j.l.Class objects).  Note
>>>this is an irreversible condition: objects that are unreachable can
>>>never become reachable again, except through very specific methods.
>>>
>>>When it sweeps the ReferenceQueue for unreachable classloaders, the VM
>>>places the unreachable classloaders in a queue of classloaders that are
>>>candidates for unloading.  This queue is part of the root set of the VM.
>>>  A classloader in this queue is unreachable from the heap, and can be
>>>unloaded when there are no objects of any class it has loaded.
>>>
>>>This is where my mechanism comes into play.
>>>
>>>If an object executes getClass() then its classloader is removed from
>>>the unloadable classloader queue, its weak reference gets recreated  and
>>>we're back at the initial state.  My guess is that this is a pretty
>>>infrequent method call.
>>>
>>>I think this stage of the algorithm is easy in performance terms -
>>>difficult in terms of proving correctness, but if you have an efficient
>>>reachability mechanism for classes I think the building blocks are
>>>there, and the subtleties are nothing that a talented engineer can't solve.
>>>
>>>
>>>I'm not 100% sure what your counter-proposal is: I recall 2 approaches
>>>from the mailing list:
>>>1) Each object has an additional word in its header that points back to
>>>    its j.l.Class object, and we proceed from here.
>>>
>>>Given that the mean object size is ~28 bytes, this proposal adds 14% to
>>>each object size.  This increases the frequency of GC by 14% and incurs
>>>a 14% slowdown.  Of course this is an oversimplification but a 14%
>>>slowdown is a pretty lousy starting point to argue from.
>>>
>>>2) The existing pointer in the GC header is traced during GC time.
>>>
>>>The average number of pointers per object (excluding the vtable) is
>>>between 1.5 and 2 for the majority of benchmarks I have looked at
>>>(footnote: if you know something different, drop me a line) (geometric
>>>mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
>>>additional reference per object will therefore increase the cost of GC
>>>by ~60% on average.  Again oversimplification but indicative.  If we
>>>assume that GC accounts for 10% of runtime (more or less depending on
>>>heap size), this is a runtime overhead of 6%.
>>>
>>>My proposal has been measured at ~1% overhead in GC time, or 0.1% in
>>>execution time (caveats as above).  If there is some complexity in
>>>establishing classloader reachability from this basis, I would assume it
>>>can easily be absorbed.
>>>
>>>Therefore I think my proposal, while not complete, can form the basis of
>>>an efficient complete system for class unloading.
>>>
>>>(PS: I'd *love* to be proven wrong)
>>>
>>>cheers,
>>>Robin
>>>
>>>
>>>>Regards,
>>>>Robin
>>>>
>>>>
>>>>>
>>>>>Aleksey.
>>>>>
>>>>>
>>>>>On 11/8/06, Robin Garner <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>>Pavel Pervov wrote:
>>>>>>
>>>>>>>Robin,
>>>>>>>
>>>>>>>>The kind of model I had in mind was along the lines of:
>>>>>>>>
>>>>>>>>- VM maintains a linked list (or other collection type) of the currently
>>>>>>>>loaded classloaders, each of which in turn maintains the collection of
>>>>>>>>classes loaded by that type.  The sweep of classloaders goes something
>>>>>>>>like:
>>>>>>>>
>>>>>>>>for (ClassLoader cl : classLoaders)
>>>>>>>>  for (Class c : cl.classes)
>>>>>>>>    cl.reachable |= c.vtable.reachable
>>>>>>>
>>>>>>>
>>>>>>>This is not enough. There may be live j/l/Class'es and
>>>>>>>j/l/Classloader's in the heap. Even though no objects of any classes
>>>>>>>loaded by a particular class loader are available in the heap, if we
>>>>>>>have a live reference to j/l/ClassLoader itself, it just can't be
>>>>>>>unloaded.
>>>>>>
>>>>>>OK, well how about keeping a weak reference to the j.l.ClassLoader
>>>>>>object instead of a strong one.  When the reference becomes (strong)ly
>>>>>>unreachable, invoke the class-unloading phase.
>>>>>>
>>>>>>To me the key issue from a performance POV is the reachability of
>>>>>>classes from objects in the heap.  I don't pretend to have an answer to
>>>>>>the other questions---the performance critical one is the one I have
>>>>>>addressed, and I accept there may be many solutions to this part of the
>>>>>>question.
>>>>>>
>>>>>>
>>>>>>>>I believe that a separate heap trace pass, different from the standard
>>>>>>>>GC, that visited vtables and reachable resources from there would also
>>>>>>>>be a viable solution.  As mentioned in an earlier post, writing this in
>>>>>>>>MMTk (where a heap trace operation is a class that you can easily
>>>>>>>>subtype to do this) would be easy.
>>>>>>>>
>>>>>>>>One of the advantages of my other proposal is that it can be implemented
>>>>>>>>in the VM independent of the GC to some extent.  This additional
>>>>>>>>mark/scan phase may or may not be easy to implement, depending on the
>>>>>>>>structure of DRLVM GCs, which is something I haven't explored.
>>>>>>>
>>>>>>>
>>>>>>>DRLVM may work with (potentially) any number of GCs. Designing class
>>>>>>>unloading in a way that would require mark&scan cooperation from the
>>>>>>>GC is not generally a good idea (from my HPOV).
>>>>>>
>>>>>>That's what I gathered.  Hence my proposal.
> 
> 
> 

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/
