Hi Kim,

Thanks again for looking into the changed code. Here are some responses to your comments, to make my code easier to understand (when you have time)...

On 06/26/2015 11:36 PM, Kim Barrett wrote:
On Jun 22, 2015, at 3:33 AM, Peter Levart <peter.lev...@gmail.com> wrote:
Hi Kim,
Sorry, I just noticed you'd sent an update a few days ago.

I don't have time to look carefully today, but a few quick comments.

On 06/09/2015 07:44 AM, Peter Levart wrote:
On 06/09/2015 04:03 AM, Kim Barrett wrote:
On May 31, 2015, at 3:32 PM, Peter Levart <peter.lev...@gmail.com>
  wrote:

So, for the curious, here's the improved prototype:

http://cr.openjdk.java.net/~plevart/misc/JEP132/ReferenceHandling/webrev.02/
While perusing the offered code (webrev.02) (not a careful review
yet), I think I've found a serious bug:

In Reference.java
in unhookPendingChunk
  253             // assert r.next == r;
  254             // move a chunk of pending/discovered references to a
  255             // temporary local r/next chain
...
  262                 rd.next = r;

I think that assertion suggested at line 253 does not hold, and line
262 may damage a queue.

[…]
I'm not sure why unhook is using the next field at all, rather than
just continuing to use the discovered field.  […]
Oops, you're right. Explicit enqueue-ing is what slipped my mind, since I first worked 
on finalizers and then generalized the solution... I'm afraid to use the 
'discovered' field since it is in the domain of the VM, and I think the Java side can only reset 
it to null while holding the lock. I also would not like to add another field to 
Reference just to support "chunking". Perhaps the chunk could be formed as a 
re-usable array of references. Let me think about this some more to figure out how to 
solve it most elegantly...

I'll follow up when I have something.
I think I was too afraid to use the 'discovered' field. I now think the 'discovered' field is the 
right field to use to form "chunks" of pending references. If I read the VM reference 
handling code correctly, the 'discovered' field is treated as a normal instance field as far as GC 
is concerned as soon as the Reference is not active any more (reference.next != null). All pending 
references found by Java code navigating the 'pending' chain have already been "de-activated" by GC 
when discovered (reference.next == reference). So it is safe to manipulate the 'discovered' field 
in Java code.
Yes, that's what I had in mind.  The collector must treat the
discovered field that way, because once notified and added to the
pending list, a reference is mostly out of the hands of the collector
and the discovered field is just the link for the pending list, and
from then on is the responsibility of Java code to deal with.

That's exactly my understanding too.

Here's the modified prototype that uses the 'discovered' field to form chunks of references:

http://cr.openjdk.java.net/~plevart/misc/JEP132/ReferenceHandling/webrev.05/

[…]
With this arrangement, I get even less CPU overhead when running the benchmark, 
most probably because unhooking a chunk of pending references now doesn't 
involve writes to many fields - just two writes per chunk: the 'discovered' 
field of the last Reference in the chunk is reset to null
That might be wrong, and the end marker for a list linked through
discovered is supposed to be a self-loop.  But I may be misremembering
or confusing with next-linked lists, and don't have time today to
examine the code.  I remember the discovery code cares about whether
that field is NULL or not, but it may not matter when the reference
has been deactivated (next field is non-NULL).

Yes, you might be confusing the 'discovered' field with the 'next' field. Some facts about the 'discovered' field:

- It starts as null. It is used by GC to form internal lists of discovered references, which *are* terminated with a self-loop.
- When GC publishes such an internal list by hooking it onto the static 'pending' field, it makes sure the list is terminated by null; it also de-activates the reference(s) by self-looping their 'next' field at the same time.
- Therefore the last element of the pending/discovered chain has discovered == null (unlike the 'next' field of a pending or last enqueued reference).
- The original reference handling code resets the 'discovered' field to null as soon as a reference is unhooked from the 'pending' chain - it does the unhooking one-by-one.

My changes to the reference handling code treat the 'discovered' field similarly. They reset it to null, albeit not in the same order as the original code. The last element in a chunk of unhooked pending references has its 'discovered' field reset immediately, as soon as the chunk is unhooked. The other elements of the chunk have the field reset to null as they are processed or enqueued.

I think this does not break any invariants that the GC expects to be maintained.
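
To make this concrete, here is a rough, self-contained sketch of the chunking idea - not the actual webrev code; PendingListModel, Ref and CHUNK_SIZE are stand-in names, and the real fields live inside java.lang.ref.Reference:

    // Illustrative model of unhooking a chunk of pending references through a
    // 'discovered'-style link field. Only two writes are needed per chunk:
    // the static 'pending' head and the last element's 'discovered' link.
    class PendingListModel {
        static class Ref {
            Ref discovered;                      // stands in for Reference.discovered
        }

        static final Object lock = new Object(); // stands in for Reference.lock
        static Ref pending;                      // stands in for Reference.pending
        static final int CHUNK_SIZE = 64;        // arbitrary value for the sketch

        static Ref unhookPendingChunk() {
            synchronized (lock) {
                Ref chunk = pending;
                if (chunk == null) {
                    return null;                 // nothing pending
                }
                Ref last = chunk;
                for (int i = 1; i < CHUNK_SIZE && last.discovered != null; i++) {
                    last = last.discovered;      // walk at most CHUNK_SIZE elements
                }
                pending = last.discovered;       // 1st write: new head of the pending chain
                last.discovered = null;          // 2nd write: terminate the unhooked chunk
                return chunk;                    // processed/enqueued outside the lock
            }
        }
    }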

The prototype contains other things as well - not all of them may be suitable 
for inclusion in JDK 9. They could be selectively removed or tailored, 
but I think they deserve to be mentioned and considered:

Using FJPool as a means to execute Finalizer(s) and other j.l.r.Cleaner(s), 
which enables easy scaling to more than one worker thread with less overhead 
than ReferenceQueue-based processing.

Public j.l.r.Cleaner interface. […]
There is already some ongoing discussion of this kind of thing.  I
think we should be encouraging folks to transition away from
finalize() to reference-based cleanup.  But the amount of boiler-plate
code needed to use references directly makes that not so appealing.  I
did a little prototyping there, but have handed off to Roger Riggs.

Roger mentioned that some of the feedback he'd received on the little
prototype I gave him was about the need for some sort of worker thread
pool for scaling.

Using an internal instance of FJPool, for example, like in the presented prototype? I have been thinking about alternatives:

ReferenceQueue-based processing has one advantage over using a single common pool of threads: it allows forming different queues for different purposes. Each queue can be processed by its own thread(s), so the queues are isolated from one another. While a common pool of threads used for mixed clean-up purposes can be more efficient, it is also more fragile: when some clean-up code misbehaves, it affects more references.
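
For illustration, the isolation point looks roughly like this - a minimal example of my own, not code from the prototype:

    // A subsystem with its own ReferenceQueue and its own daemon thread:
    // if this clean-up code hangs, only this queue's processing is affected.
    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;

    class SubsystemCleanup {
        static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();

        static {
            Thread t = new Thread(() -> {
                while (true) {
                    try {
                        Reference<?> ref = QUEUE.remove();  // blocks until something is enqueued
                        // subsystem-specific clean-up for 'ref' goes here
                    } catch (InterruptedException e) {
                        // ignore and keep processing
                    }
                }
            }, "subsystem-cleanup");
            t.setDaemon(true);
            t.start();
        }
    }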

An alternative to the public j.l.r.Cleaner interface presented in the prototype (which makes its implementing Reference(s) "active" objects processed by a shared pool of threads) could be to add j.l.r.Reference constructor(s) taking an Executor argument instead of a ReferenceQueue. The problem with this approach is that Executor is an interface that can be implemented by the user, so it does not shield the sole reference handler thread from executing user code. We would not want a custom misbehaving Executor to take down the whole JVM by blocking the reference handler thread. So perhaps, instead of an Executor, the additional Reference constructor argument could be a type that prevents the user from overriding its task-submission code path, but still allows customization in other aspects.
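
Purely as a sketch of that last idea - none of these names exist in the JDK; they only illustrate the shape such a type could take:

    // A hypothetical final "cleanup executor" type: users can customize the
    // threads it uses, but cannot override the task-submission path itself,
    // so the reference handler never runs arbitrary user submission code.
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadFactory;

    public final class CleanupExecutor {
        private final ExecutorService pool;

        private CleanupExecutor(ThreadFactory factory) {
            this.pool = Executors.newSingleThreadExecutor(factory);
        }

        // customization hook: thread name, priority, daemon status, ...
        public static CleanupExecutor withThreadFactory(ThreadFactory factory) {
            return new CleanupExecutor(factory);
        }

        // non-overridable: the only path the reference handler would use
        // to hand off clean-up work to user-configured threads
        void submit(Runnable cleanup) {
            pool.execute(cleanup);
        }
    }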

Public j.l.r.Finalizator is an example of a FinalReference based j.l.r.Cleaner. 
While a [Phantom|Weak|Soft]Cleaner is safer, since it doesn't allow access to 
its referent when fired, some JDK code (File[Input|Output]Stream for example) 
still uses finalization
See https://bugs.openjdk.java.net/browse/JDK-8080225

It seems that for some applications finalize-based cleanup has a
substantial performance cost that is a complete waste.  A closed
stream still needs to be finalized, delaying reclamation of the memory
and pushing the associated final-reference through the various lists
and threads until the finalize() function finally gets called and
finds it has nothing to do.

PhantomReference-based cleanup allows the phantom reference to be
deregistered, cleared, and dropped as part of a close-like operation,
completely eliminating reference processing costs for a closed object.

The j.l.r.Finalizator presented in the prototype is similar in that respect. If it were used in File[Input|Output]Stream instead of the classical finalize() method, it would, in most cases, prevent the object from passing through the reference processing steps. If a FinalReference is clear()ed as part of a close-like operation, it is never discovered by GC as a pending reference. j.l.r.Finalizator also eliminates the need for the VM -> Java registration callback, as it is registered manually - usually in the object's constructor.
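
The deregistration idea itself can be sketched with a plain PhantomReference (the prototype's j.l.r.Finalizator API is not reproduced here; TrackedResource and REGISTRY are made-up names):

    import java.lang.ref.PhantomReference;
    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // A resource whose clean-up reference is cleared and deregistered in close(),
    // so an explicitly closed instance never enters reference processing at all.
    class TrackedResource implements AutoCloseable {
        private static final ReferenceQueue<TrackedResource> QUEUE = new ReferenceQueue<>();
        // keeps each PhantomReference strongly reachable until it is cleared or processed
        private static final Set<Reference<TrackedResource>> REGISTRY =
                ConcurrentHashMap.newKeySet();

        private final PhantomReference<TrackedResource> cleanup;

        TrackedResource() {
            cleanup = new PhantomReference<>(this, QUEUE);
            REGISTRY.add(cleanup);
        }

        @Override
        public void close() {
            releaseNativeState();
            REGISTRY.remove(cleanup);
            cleanup.clear();      // deregistered: GC will never discover or enqueue it
        }

        private void releaseNativeState() {
            // free file descriptor / native memory here
        }

        // A background thread (not shown) would poll QUEUE for references to
        // instances that were never closed and perform the clean-up for them.
    }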

I have looked into what it would take to modify the File[Input|Output]Stream, FileDescriptor, etc. code to use PhantomReference-based cleanup instead of finalize(), and it is not easy. If I manage to push it to the end I will present it here, but so far I have had to modify code in lots of places, including redesigning the API of the JNI methods so that they are static, etc. We'll see whether such a redesign is still manageable or whether it is easier to use a j.l.r.Finalizator-like mechanism that still allows the clean-up code to access the "finalizee" state but also allows FinalReference de-registration on close().
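
A rough illustration of why the native methods end up static in such a conversion - FileStreamCleanup and close0 are hypothetical stand-ins, not the real JNI code:

    // The clean-up runs after the stream itself is unreachable, so it can only
    // use state captured outside the stream, e.g. the raw fd value - hence the
    // pressure to make the close path a static method taking the fd explicitly.
    class FileStreamCleanup implements Runnable {
        private final int fd;          // captured at registration time, not read from the stream

        FileStreamCleanup(int fd) {
            this.fd = fd;
        }

        @Override
        public void run() {
            close0(fd);
        }

        private static void close0(int fd) {
            // stand-in for a static native close that takes the fd explicitly
        }
    }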

Are any of the features of this prototype interesting for you and worth 
pursuing further?
This mostly appears to be a question for core-libs; that's where the
proposed changes are located.  GC is involved mostly to ensure the
internal APIs between the collector and reference processing are
maintained.

My personal opinion is that we should be working toward a day when we
can remove the finalize function, rather than trying to make the
infrastructure around it faster.  But clearly that day won't be soon,
if ever.



The infrastructure is shared by all Reference kinds. Improving it impacts all reference processing, not only finalization. If it makes finalization faster too, that can be viewed as just a coincidence ;-)

But I agree. We should move internal code away from finalization and also give users viable alternatives that would make the finalize() method obsolete one day.

Regards, Peter
