Hi Dmitry,

ReferenceQueue is not really a queue, but a LIFO stack. Scanner is walking the stack from top (the 'head') to bottom (the last element pointing to itself). When poller(s) overtake the scanner, it actually means that the top of the stack has been eaten under scanner's feet while trying to grab it. Restarting from 'head' actually means 'catching-up' wit poller(s) and trying to keep up with them. We don't get the 'eaten' elements, but might be lucky to get some more food until it's all eaten. Usually we will get all of the elements, since poller(s) must synchronize and do additional work, so they are slower and there's also enqueuer(s) that push elements on the top of the stack so poller(s) must eat those last pushed elements before they can continue chasing the scanner...

When scanner is overtaken by poller, then there is a chance the scanner will get less elements than there were present in the stack if a snapshot was taken, so catching-up by restarting from 'head' tries to at least recover some of the rest of the elements of that snapshot before they are gone.

Does this make more sense now?

Regards, Peter

On 05/14/2015 07:41 PM, Dmitry Samersoff wrote:
Peter,

Could we just bail out on r == r.next?

It gives us less accurate result, but I suspect that If we restart from
head we need to flush all counters.

As far I understand queue poller removes items one by one from a queue
end so if we overtaken by queue poller we can safely assume that
we are at the end of the queue.

Is it correct?

-Dmitry

On 2015-05-14 10:50, Peter Levart wrote:
Hi Derek,

On 05/13/2015 10:34 PM, Derek White wrote:
Hi Peter,

I don't have smoking gun evidence that your change introduces a bug,
but I have some concerns...
I did have a concern too, ...

On 5/12/15 6:05 PM, Peter Levart wrote:
Hi Dmitry,

You iterate the queue then, not the unfinalized list. That's more
logical.

Holding the queue's lock may pause reference handler and finalizer
threads for the entire time of iteration. This can blow up the
application. Suppose one wants to diagnose the application because he
suspects that finalizer thread hardly keeps up with production of
finalizable instances. This can happen if the allocation rate of
finalizable objects is very high and/or finalize() methods are not
coded to be fast enough. Suppose the queue of Finalizer(s) builds up
gradually and is already holding several million objects. Now you
initiate the diagnostic command to print the queue. The iteration
over and grouping/counting of several millions of Finalizer(s) takes
some time. This blocks finalizer thread and reference handler thread.
But does not block the threads that allocate finalizable objects. By
the time the iteration is over, the JVM blows up and application
slows down to a halt doing GCs most of the time, getting OOMEs, etc...

It is possible to iterate the elements of the queue for diagnostic
purposes without holding a lock. The modification required to do that
is the following (havent tested this, but I think it should work):

http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.01/

One issue to watch out for is the garbage collectors inspect the
Reference.next field from C code directly (to distinguish active vs.
pending, iterate over oops, etc.). You can search hotspot for
java_lang_ref_Reference::next*, java_lang_ref_Reference::set_next*.

Your change makes "inactive" References superficially look like
"enqueued" References. The GC code is rather subtle and I haven't yet
seen a case where it would get confused by this change, but there
could easily be something like that lurking in the GC code.
...but then I thought that GC can in no way treat a Reference
differently whether it is still enqueued in a ReferenceQueue or already
dequeued (inactive) - responsibility is already on the Java side.
Currently the definition of Reference.next is this:

     /* When active:   NULL
      *     pending:   this
      *    Enqueued:   next reference in queue (or this if last)
      *    Inactive:   this
      */
     @SuppressWarnings("rawtypes")
     Reference next;

We see that, unless GC inspects all ReferenceQueue instances and scans
their lists too, the state of a Reference that is enqueued as last in
list is indistinguishable from the state of inactive Reference. So I
deduced that this distinction (enqueued/inactive) can't be established
solely on the value of .next field ( == this or != this)...

But I should inspect the GC code too to build a better understanding of
that part of the story...

Anyway. It turns out that there is already enough state in Reference to
destinguish between it being enqueued as last in list and already
dequeued (inactive) - the additional state is Reference.queue that
transitions from ReferenceQueue.ENQUEUED -> ReferenceQueue.NULL in
ReferenceQueue.reallyPoll. We need to just make sure the two fields
(r.next and r.queue) are assigned and read in correct order. This can be
achieved either by making Reference.next a volatile field or by a couple
of explicit fences:

http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.02/

The assignment of r.queue to ReferenceQueue.ENQUEUED already happens
before linking the reference into the queue's head in
ReferenceQueue.enqueue():

             r.queue = ENQUEUED;
             r.next = (head == null) ? r : head;
             head = r;

Both stores are volatile.

I also suggest the addition to the ReferenceQueue to be contained
(package-private) and as generic as possible. That's why I suggest
forEach iteration method with no concrete logic.

It would be possible to encapsulate the entire logic into a special
package-private class (say java.lang.ref.DiagnosticCommands) with
static method(s) (printFinalizationQueue)... You could just expose a
package-private forEach static method from Finalizer and code the
rest in DiagnosticCommands.
That's good for encapsulation. But I'm concerned that if "forEach" got
exposed beyond careful use in DiagnosticCommands, and the Referents
were leaked back into the heap, then we could get unexpected object
resurrection after finalization. This isn't a bug on it's own - any
finalizer might resurrect its object if not careful, but ordinarily
/only/ the finalizer could resurrect the object. This could invalidate
application invariants?
Well, it all stays in the confines of package-private API - internal to
JDK. Any future additional use should be reviewed carefully. Comments on
the forEach() method should warn about that.

I agree that using a lock over the ReferenceQueue iteration opens up
/another/ case of diagnostics affecting application behavior. And
there are pathological scenarios where this gets severe. This is
unfortunate but not uncommon. There is enough complication here that
you should be sure that the fix for diagnostics performance doesn't
introduce subtle bugs to the system in general. A change in this area
should get a thorough review from both the runtime and GC sides.
Is the webrev.02 proposed above more acceptable in that respect? It does
not introduce any logical changes to existing code.

Better yet, the reference handling code in GC and runtime should
probably be thrown out and re-written, but that's another issue :-)
I may have some proposals in that direction. Stay tuned.

Regards, Peter

- Derek

Regards, Peter


On 05/12/2015 07:10 PM, Dmitry Samersoff wrote:
Everybody,

Updated version:

http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.03/

Now it iterates over queue and output result sorted by number of instances.

-Dmitry

On 2015-05-07 00:51, Derek White wrote:
Hi Dmitry, Staffan,

Lots of good comments here.

On the topic of what list should be printed out, I think we should focus
on objects waiting to be finalized - e.g. the contents of the
ReferenceQueue. It's more of a pain to walk the ReferenceQueue, but you
could add a summerizeQueue(TreeMap<String, Integer>) method, or a
general iterator and lambda.

A histogram of objects with finalize methods might also be interesting,
but you can get most of the same information from a heap histogram
(ClassHistogramDCmd, and search for count of Finalizer instances).

BTW, by only locking the ReferenceQueue, we should only be blocking the
FinalizerThread from making progress (and I think there's a GC thread
that runs after GC that sorts found References objects that need
processing into their respective ReferenceQueues). But locking the
"unfinalized" list blocks initializing any object with a finalize() method.

The sorting suggestion is a nice touch.

  - Derek White, GC team

On 5/5/15 10:40 AM, Peter Levart wrote:
Hi Dmitry, Staffan,

On 05/05/2015 12:38 PM, Staffan Larsen wrote:
Dmitry,

I think this should be reviewed on hotspot-gc and core-libs-dev as
well considering the changes to Finalizer.

I’m a little worried about the potentially very large String that is
returned from printFinalizationQueue(). A possible different approach
would be to write printFinalizationQueue() in C++ inside Hotspot.
That would allow for outputting one line at a time. The output would
still be saved in memory (since the stream is buffered), but at least
the data is only saved once in memory, then. It would make the code a
bit harder to write, so its a question of how scared we are of
running out of memory.
If the output is just a histogram of # of instances per class name,
then it should not be too large, as there are not many different
classes overriding finalize() method (I counted 72 overriddings of
finalize() method in the whole jdk/src tree).

I'm more concerned about the fact that while traversing the list, a
lock is held, which might impact prompt finalization(). Is it
acceptable for diagnostic output to impact performance and/or
interfere with synchronization?

It might be possible to scan the list without holding a lock for
diagnostic purposes if Finalizer.remove() didn't set the removed
Finalizer.next pointer to point back to itself:

     private void remove() {
         synchronized (lock) {
             if (unfinalized == this) {
                 if (this.next != null) {
                     unfinalized = this.next;
                 } else {
                     unfinalized = this.prev;
                 }
             }
             if (this.next != null) {
                 this.next.prev = this.prev;
             }
             if (this.prev != null) {
                 this.prev.next = this.next;
             }
             // this.next = this; must not be set so that we can
traverse the list unsynchronized
             this.prev = this;   /* Indicates that this has been
finalized */
         }
     }

For detecting whether a Finalizer is already processed, the 'prev'
pointer could be used instead:

     private boolean hasBeenFinalized() {
         return (prev == this);
     }

Also, to make sure a data race dereferencing 'unfinalized' in
unsynchronized printFinalizationQueue() would get you a fully
initialized Finalizer instance (in particular the next pointer), you
would have to make the 'unfinalized' field volatile or insert an
Unsafe.storeFence() before assigning to 'unfinalized':

     private void add() {
         synchronized (lock) {
             if (unfinalized != null) {
                 this.next = unfinalized;
                 unfinalized.prev = this;
             }
             // make sure a data race dereferencing 'unfinalized'
             // in printFinalizationQueue() does see the Finalizer
             // instance fully initialized
             // (in particular the 'next' pointer)
             U.storeFence();
             unfinalized = this;
         }
     }


By doing these modifications, I think you can remove
synchronized(lock) {} from printFinalizationQueue().

I see you are traversing the ‘unfinalized’ list in Finalizer, whereas
the old SA code for ‘-finalizerinfo' traverses the ‘queue’ list. I am
not sure of the difference, perhaps someone from the GC team can help
out?
The 'queue' is a ReferenceQueue of Finalizer (FinalReference)
instances pending processing by finalizer thread because their
referents are eligible for finalization (they are not reachable any
more). The 'unfinalized' is a doubly-linked list of all Finalizer
instances for which their referents have not been finalized yet
(including those that are still reachable and alive). The later serves
two purposes:
- it keeps Finalizer instances reachable until they are processed
- it is a source of unfinalized instances for
running-finalizers-on-exit if requested by
System.runFinalizersOnExit(true);

So it really depends on what one would like to see. Showing the queue
may be interesting if one wants to see how the finalizer thread is
coping with processing the finalize() invocations. Showing unfinalized
list may be interesting if one wants to know how many live +
finalization pending instances are there on the heap that override
finalize() method.

Regards, Peter

For the output, it would be a nice touch to sort it on the number of
references for each type. Perhaps outputting it more like a table
(see the old code again) would also make it easier to digest.

In diagnosticCommand.hpp, the new commands should override
permission() and limit access:

    static const JavaPermission permission() {
      JavaPermission p = {"java.lang.management.ManagementPermission",
                          "monitor", NULL};
      return p;
    }

The two tests don’t validate the output in any way. Would it be
possible to add validation? Perhaps hard to make sure an object is on
the finalizer queue… Hmm.

Thanks,
/Staffan


On 5 maj 2015, at 12:01, Dmitry Samersoff
<dmitry.samers...@oracle.com> wrote:

Everyone,

Please review the fix:

http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.01/

heap dcmd outputs the same information as SIGBREAK

finalizerinfo dcmd outputs a list of all classes in finalization queue
with count

-Dmitry

--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

Reply via email to