Re: RFR(M,v6): JDK-8059036 : Implement Diagnostic Commands for heap and finalizerinfo

Dmitry Samersoff Fri, 15 May 2015 13:25:10 -0700

Everybody,

Please review updated version.


http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.06/

JDK part provided by Peter Levart, so all credentials belongs to him.

-Dmitry

On 2015-05-14 23:35, Peter Levart wrote:
> 
> 
> On 05/14/2015 10:11 PM, Peter Levart wrote:
>> Hi Dmitry,
>>
>> ReferenceQueue is not really a queue, but a LIFO stack. Scanner is
>> walking the stack from top (the 'head') to bottom (the last element
>> pointing to itself). When poller(s) overtake the scanner, it actually
>> means that the top of the stack has been eaten under scanner's feet
>> while trying to grab it. Restarting from 'head' actually means
>> 'catching-up' wit poller(s) and trying to keep up with them. We don't
>> get the 'eaten' elements, but might be lucky to get some more food
>> until it's all eaten. Usually we will get all of the elements, since
>> poller(s) must synchronize and do additional work, so they are slower
>> and there's also enqueuer(s) that push elements on the top of the
>> stack so poller(s) must eat those last pushed elements before they can
>> continue chasing the scanner...
>>
>> When scanner is overtaken by poller, then there is a chance the
>> scanner will get less elements than there were present in the stack if
>> a snapshot was taken, so catching-up by restarting from 'head' tries
>> to at least recover some of the rest of the elements of that snapshot
>> before they are gone.
> 
> Also, the chance that the scanner is overtaken by poller is greater at
> the start of race. The scanner and poller start from the same spot and
> as we know threads are "jumpy" so it will happen quite frequently that a
> poller jumps before scanner. So just giving-up when overtaken might
> return 0 or just a few elements when there are millions there in the
> queue. When scanner finally gets a head start, it will usually lead the
> race to the end.
> 
> Peter
> 
>>
>> Does this make more sense now?
>>
>> Regards, Peter
>>
>> On 05/14/2015 07:41 PM, Dmitry Samersoff wrote:
>>> Peter,
>>>
>>> Could we just bail out on r == r.next?
>>>
>>> It gives us less accurate result, but I suspect that If we restart from
>>> head we need to flush all counters.
>>>
>>> As far I understand queue poller removes items one by one from a queue
>>> end so if we overtaken by queue poller we can safely assume that
>>> we are at the end of the queue.
>>>
>>> Is it correct?
>>>
>>> -Dmitry
>>>
>>> On 2015-05-14 10:50, Peter Levart wrote:
>>>> Hi Derek,
>>>>
>>>> On 05/13/2015 10:34 PM, Derek White wrote:
>>>>> Hi Peter,
>>>>>
>>>>> I don't have smoking gun evidence that your change introduces a bug,
>>>>> but I have some concerns...
>>>> I did have a concern too, ...
>>>>
>>>>> On 5/12/15 6:05 PM, Peter Levart wrote:
>>>>>> Hi Dmitry,
>>>>>>
>>>>>> You iterate the queue then, not the unfinalized list. That's more
>>>>>> logical.
>>>>>>
>>>>>> Holding the queue's lock may pause reference handler and finalizer
>>>>>> threads for the entire time of iteration. This can blow up the
>>>>>> application. Suppose one wants to diagnose the application because he
>>>>>> suspects that finalizer thread hardly keeps up with production of
>>>>>> finalizable instances. This can happen if the allocation rate of
>>>>>> finalizable objects is very high and/or finalize() methods are not
>>>>>> coded to be fast enough. Suppose the queue of Finalizer(s) builds up
>>>>>> gradually and is already holding several million objects. Now you
>>>>>> initiate the diagnostic command to print the queue. The iteration
>>>>>> over and grouping/counting of several millions of Finalizer(s) takes
>>>>>> some time. This blocks finalizer thread and reference handler thread.
>>>>>> But does not block the threads that allocate finalizable objects. By
>>>>>> the time the iteration is over, the JVM blows up and application
>>>>>> slows down to a halt doing GCs most of the time, getting OOMEs, etc...
>>>>>>
>>>>>> It is possible to iterate the elements of the queue for diagnostic
>>>>>> purposes without holding a lock. The modification required to do that
>>>>>> is the following (havent tested this, but I think it should work):
>>>>>>
>>>>>>    
>>>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.01/
>>>>>>
>>>>> One issue to watch out for is the garbage collectors inspect the
>>>>> Reference.next field from C code directly (to distinguish active vs.
>>>>> pending, iterate over oops, etc.). You can search hotspot for
>>>>> java_lang_ref_Reference::next*, java_lang_ref_Reference::set_next*.
>>>>>
>>>>> Your change makes "inactive" References superficially look like
>>>>> "enqueued" References. The GC code is rather subtle and I haven't yet
>>>>> seen a case where it would get confused by this change, but there
>>>>> could easily be something like that lurking in the GC code.
>>>> ...but then I thought that GC can in no way treat a Reference
>>>> differently whether it is still enqueued in a ReferenceQueue or already
>>>> dequeued (inactive) - responsibility is already on the Java side.
>>>> Currently the definition of Reference.next is this:
>>>>
>>>>     /* When active:   NULL
>>>>      *     pending:   this
>>>>      *    Enqueued:   next reference in queue (or this if last)
>>>>      *    Inactive:   this
>>>>      */
>>>>     @SuppressWarnings("rawtypes")
>>>>     Reference next;
>>>>
>>>> We see that, unless GC inspects all ReferenceQueue instances and scans
>>>> their lists too, the state of a Reference that is enqueued as last in
>>>> list is indistinguishable from the state of inactive Reference. So I
>>>> deduced that this distinction (enqueued/inactive) can't be established
>>>> solely on the value of .next field ( == this or != this)...
>>>>
>>>> But I should inspect the GC code too to build a better understanding of
>>>> that part of the story...
>>>>
>>>> Anyway. It turns out that there is already enough state in Reference to
>>>> destinguish between it being enqueued as last in list and already
>>>> dequeued (inactive) - the additional state is Reference.queue that
>>>> transitions from ReferenceQueue.ENQUEUED -> ReferenceQueue.NULL in
>>>> ReferenceQueue.reallyPoll. We need to just make sure the two fields
>>>> (r.next and r.queue) are assigned and read in correct order. This can be
>>>> achieved either by making Reference.next a volatile field or by a couple
>>>> of explicit fences:
>>>>
>>>>    
>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.02/
>>>>
>>>> The assignment of r.queue to ReferenceQueue.ENQUEUED already happens
>>>> before linking the reference into the queue's head in
>>>> ReferenceQueue.enqueue():
>>>>
>>>>             r.queue = ENQUEUED;
>>>>             r.next = (head == null) ? r : head;
>>>>             head = r;
>>>>
>>>> Both stores are volatile.
>>>>
>>>>>> I also suggest the addition to the ReferenceQueue to be contained
>>>>>> (package-private) and as generic as possible. That's why I suggest
>>>>>> forEach iteration method with no concrete logic.
>>>>>>
>>>>>> It would be possible to encapsulate the entire logic into a special
>>>>>> package-private class (say java.lang.ref.DiagnosticCommands) with
>>>>>> static method(s) (printFinalizationQueue)... You could just expose a
>>>>>> package-private forEach static method from Finalizer and code the
>>>>>> rest in DiagnosticCommands.
>>>>> That's good for encapsulation. But I'm concerned that if "forEach" got
>>>>> exposed beyond careful use in DiagnosticCommands, and the Referents
>>>>> were leaked back into the heap, then we could get unexpected object
>>>>> resurrection after finalization. This isn't a bug on it's own - any
>>>>> finalizer might resurrect its object if not careful, but ordinarily
>>>>> /only/ the finalizer could resurrect the object. This could invalidate
>>>>> application invariants?
>>>> Well, it all stays in the confines of package-private API - internal to
>>>> JDK. Any future additional use should be reviewed carefully. Comments on
>>>> the forEach() method should warn about that.
>>>>
>>>>> I agree that using a lock over the ReferenceQueue iteration opens up
>>>>> /another/ case of diagnostics affecting application behavior. And
>>>>> there are pathological scenarios where this gets severe. This is
>>>>> unfortunate but not uncommon. There is enough complication here that
>>>>> you should be sure that the fix for diagnostics performance doesn't
>>>>> introduce subtle bugs to the system in general. A change in this area
>>>>> should get a thorough review from both the runtime and GC sides.
>>>> Is the webrev.02 proposed above more acceptable in that respect? It does
>>>> not introduce any logical changes to existing code.
>>>>
>>>>> Better yet, the reference handling code in GC and runtime should
>>>>> probably be thrown out and re-written, but that's another issue :-)
>>>> I may have some proposals in that direction. Stay tuned.
>>>>
>>>> Regards, Peter
>>>>
>>>>> - Derek
>>>>>
>>>>>> Regards, Peter
>>>>>>
>>>>>>
>>>>>> On 05/12/2015 07:10 PM, Dmitry Samersoff wrote:
>>>>>>> Everybody,
>>>>>>>
>>>>>>> Updated version:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.03/
>>>>>>>
>>>>>>> Now it iterates over queue and output result sorted by number of 
>>>>>>> instances.
>>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>> On 2015-05-07 00:51, Derek White wrote:
>>>>>>>> Hi Dmitry, Staffan,
>>>>>>>>
>>>>>>>> Lots of good comments here.
>>>>>>>>
>>>>>>>> On the topic of what list should be printed out, I think we should 
>>>>>>>> focus
>>>>>>>> on objects waiting to be finalized - e.g. the contents of the
>>>>>>>> ReferenceQueue. It's more of a pain to walk the ReferenceQueue, but you
>>>>>>>> could add a summerizeQueue(TreeMap<String, Integer>) method, or a
>>>>>>>> general iterator and lambda.
>>>>>>>>
>>>>>>>> A histogram of objects with finalize methods might also be interesting,
>>>>>>>> but you can get most of the same information from a heap histogram
>>>>>>>> (ClassHistogramDCmd, and search for count of Finalizer instances).
>>>>>>>>
>>>>>>>> BTW, by only locking the ReferenceQueue, we should only be blocking the
>>>>>>>> FinalizerThread from making progress (and I think there's a GC thread
>>>>>>>> that runs after GC that sorts found References objects that need
>>>>>>>> processing into their respective ReferenceQueues). But locking the
>>>>>>>> "unfinalized" list blocks initializing any object with a finalize() 
>>>>>>>> method.
>>>>>>>>
>>>>>>>> The sorting suggestion is a nice touch.
>>>>>>>>
>>>>>>>>  - Derek White, GC team
>>>>>>>>
>>>>>>>> On 5/5/15 10:40 AM, Peter Levart wrote:
>>>>>>>>> Hi Dmitry, Staffan,
>>>>>>>>>
>>>>>>>>> On 05/05/2015 12:38 PM, Staffan Larsen wrote:
>>>>>>>>>> Dmitry,
>>>>>>>>>>
>>>>>>>>>> I think this should be reviewed on hotspot-gc and core-libs-dev as
>>>>>>>>>> well considering the changes to Finalizer.
>>>>>>>>>>
>>>>>>>>>> I’m a little worried about the potentially very large String that is
>>>>>>>>>> returned from printFinalizationQueue(). A possible different approach
>>>>>>>>>> would be to write printFinalizationQueue() in C++ inside Hotspot.
>>>>>>>>>> That would allow for outputting one line at a time. The output would
>>>>>>>>>> still be saved in memory (since the stream is buffered), but at least
>>>>>>>>>> the data is only saved once in memory, then. It would make the code a
>>>>>>>>>> bit harder to write, so its a question of how scared we are of
>>>>>>>>>> running out of memory.
>>>>>>>>> If the output is just a histogram of # of instances per class name,
>>>>>>>>> then it should not be too large, as there are not many different
>>>>>>>>> classes overriding finalize() method (I counted 72 overriddings of
>>>>>>>>> finalize() method in the whole jdk/src tree).
>>>>>>>>>
>>>>>>>>> I'm more concerned about the fact that while traversing the list, a
>>>>>>>>> lock is held, which might impact prompt finalization(). Is it
>>>>>>>>> acceptable for diagnostic output to impact performance and/or
>>>>>>>>> interfere with synchronization?
>>>>>>>>>
>>>>>>>>> It might be possible to scan the list without holding a lock for
>>>>>>>>> diagnostic purposes if Finalizer.remove() didn't set the removed
>>>>>>>>> Finalizer.next pointer to point back to itself:
>>>>>>>>>
>>>>>>>>>     private void remove() {
>>>>>>>>>         synchronized (lock) {
>>>>>>>>>             if (unfinalized == this) {
>>>>>>>>>                 if (this.next != null) {
>>>>>>>>>                     unfinalized = this.next;
>>>>>>>>>                 } else {
>>>>>>>>>                     unfinalized = this.prev;
>>>>>>>>>                 }
>>>>>>>>>             }
>>>>>>>>>             if (this.next != null) {
>>>>>>>>>                 this.next.prev = this.prev;
>>>>>>>>>             }
>>>>>>>>>             if (this.prev != null) {
>>>>>>>>>                 this.prev.next = this.next;
>>>>>>>>>             }
>>>>>>>>>             // this.next = this; must not be set so that we can
>>>>>>>>> traverse the list unsynchronized
>>>>>>>>>             this.prev = this;   /* Indicates that this has been
>>>>>>>>> finalized */
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> For detecting whether a Finalizer is already processed, the 'prev'
>>>>>>>>> pointer could be used instead:
>>>>>>>>>
>>>>>>>>>     private boolean hasBeenFinalized() {
>>>>>>>>>         return (prev == this);
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> Also, to make sure a data race dereferencing 'unfinalized' in
>>>>>>>>> unsynchronized printFinalizationQueue() would get you a fully
>>>>>>>>> initialized Finalizer instance (in particular the next pointer), you
>>>>>>>>> would have to make the 'unfinalized' field volatile or insert an
>>>>>>>>> Unsafe.storeFence() before assigning to 'unfinalized':
>>>>>>>>>
>>>>>>>>>     private void add() {
>>>>>>>>>         synchronized (lock) {
>>>>>>>>>             if (unfinalized != null) {
>>>>>>>>>                 this.next = unfinalized;
>>>>>>>>>                 unfinalized.prev = this;
>>>>>>>>>             }
>>>>>>>>>             // make sure a data race dereferencing 'unfinalized'
>>>>>>>>>             // in printFinalizationQueue() does see the Finalizer
>>>>>>>>>             // instance fully initialized
>>>>>>>>>             // (in particular the 'next' pointer)
>>>>>>>>>             U.storeFence();
>>>>>>>>>             unfinalized = this;
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> By doing these modifications, I think you can remove
>>>>>>>>> synchronized(lock) {} from printFinalizationQueue().
>>>>>>>>>
>>>>>>>>>> I see you are traversing the ‘unfinalized’ list in Finalizer, whereas
>>>>>>>>>> the old SA code for ‘-finalizerinfo' traverses the ‘queue’ list. I am
>>>>>>>>>> not sure of the difference, perhaps someone from the GC team can help
>>>>>>>>>> out?
>>>>>>>>> The 'queue' is a ReferenceQueue of Finalizer (FinalReference)
>>>>>>>>> instances pending processing by finalizer thread because their
>>>>>>>>> referents are eligible for finalization (they are not reachable any
>>>>>>>>> more). The 'unfinalized' is a doubly-linked list of all Finalizer
>>>>>>>>> instances for which their referents have not been finalized yet
>>>>>>>>> (including those that are still reachable and alive). The later serves
>>>>>>>>> two purposes:
>>>>>>>>> - it keeps Finalizer instances reachable until they are processed
>>>>>>>>> - it is a source of unfinalized instances for
>>>>>>>>> running-finalizers-on-exit if requested by
>>>>>>>>> System.runFinalizersOnExit(true);
>>>>>>>>>
>>>>>>>>> So it really depends on what one would like to see. Showing the queue
>>>>>>>>> may be interesting if one wants to see how the finalizer thread is
>>>>>>>>> coping with processing the finalize() invocations. Showing unfinalized
>>>>>>>>> list may be interesting if one wants to know how many live +
>>>>>>>>> finalization pending instances are there on the heap that override
>>>>>>>>> finalize() method.
>>>>>>>>>
>>>>>>>>> Regards, Peter
>>>>>>>>>
>>>>>>>>>> For the output, it would be a nice touch to sort it on the number of
>>>>>>>>>> references for each type. Perhaps outputting it more like a table
>>>>>>>>>> (see the old code again) would also make it easier to digest.
>>>>>>>>>>
>>>>>>>>>> In diagnosticCommand.hpp, the new commands should override
>>>>>>>>>> permission() and limit access:
>>>>>>>>>>
>>>>>>>>>>    static const JavaPermission permission() {
>>>>>>>>>>      JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>>>>                          "monitor", NULL};
>>>>>>>>>>      return p;
>>>>>>>>>>    }
>>>>>>>>>>
>>>>>>>>>> The two tests don’t validate the output in any way. Would it be
>>>>>>>>>> possible to add validation? Perhaps hard to make sure an object is on
>>>>>>>>>> the finalizer queue… Hmm.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 5 maj 2015, at 12:01, Dmitry Samersoff
>>>>>>>>>>> <dmitry.samers...@oracle.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Everyone,
>>>>>>>>>>>
>>>>>>>>>>> Please review the fix:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> heap dcmd outputs the same information as SIGBREAK
>>>>>>>>>>>
>>>>>>>>>>> finalizerinfo dcmd outputs a list of all classes in finalization 
>>>>>>>>>>> queue
>>>>>>>>>>> with count
>>>>>>>>>>>
>>>>>>>>>>> -Dmitry
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> Dmitry Samersoff
>>>>>>>>>>> Oracle Java development team, Saint Petersburg, Russia
>>>>>>>>>>> * I would love to change the world, but they won't give me the 
>>>>>>>>>>> sources.
>>
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

Re: RFR(M,v6): JDK-8059036 : Implement Diagnostic Commands for heap and finalizerinfo

Reply via email to