Derek, Personally, I'm for volatile over explicit fence too.
So I'll change the code if Peter don't have objections. -Dmitry On 2015-05-16 01:59, Derek White wrote: > Hi Dmitry, Peter, > > I should read my email more frequently - I missed Dmitry's last webrev > email. > > My comments on a preference for volatile over Unsafe are not a strong > objection to using Unsafe fences. I just don't have enough experience > with Unsafe fences yet to agree that this use is correct. > > While looking over this Reference code I found 3-6 really simple things > that need cleaning up that have been there for years, so I'm not a big > cheerleader for more complicated things here :-) > > - Derek > > On 5/15/15 6:46 PM, Derek White wrote: >> Hi Peter, >> >> Some comments below... >> >> On 5/14/15 3:50 AM, Peter Levart wrote: >>> Hi Derek, >>> >>> On 05/13/2015 10:34 PM, Derek White wrote: >>>> Hi Peter, >>>> >>>> I don't have smoking gun evidence that your change introduces a bug, >>>> but I have some concerns... >>> >>> I did have a concern too, ... >>> >>>> >>>> On 5/12/15 6:05 PM, Peter Levart wrote: >>>>> Hi Dmitry, >>>>> >>>>> You iterate the queue then, not the unfinalized list. That's more >>>>> logical. >>>>> >>>>> Holding the queue's lock may pause reference handler and finalizer >>>>> threads for the entire time of iteration. This can blow up the >>>>> application. Suppose one wants to diagnose the application because >>>>> he suspects that finalizer thread hardly keeps up with production >>>>> of finalizable instances. This can happen if the allocation rate of >>>>> finalizable objects is very high and/or finalize() methods are not >>>>> coded to be fast enough. Suppose the queue of Finalizer(s) builds >>>>> up gradually and is already holding several million objects. Now >>>>> you initiate the diagnostic command to print the queue. The >>>>> iteration over and grouping/counting of several millions of >>>>> Finalizer(s) takes some time. This blocks finalizer thread and >>>>> reference handler thread. But does not block the threads that >>>>> allocate finalizable objects. By the time the iteration is over, >>>>> the JVM blows up and application slows down to a halt doing GCs >>>>> most of the time, getting OOMEs, etc... >>>>> >>>>> It is possible to iterate the elements of the queue for diagnostic >>>>> purposes without holding a lock. The modification required to do >>>>> that is the following (havent tested this, but I think it should work): >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.01/ >>>>> >>>> One issue to watch out for is the garbage collectors inspect the >>>> Reference.next field from C code directly (to distinguish active vs. >>>> pending, iterate over oops, etc.). You can search hotspot for >>>> java_lang_ref_Reference::next*, java_lang_ref_Reference::set_next*. >>>> >>>> Your change makes "inactive" References superficially look like >>>> "enqueued" References. The GC code is rather subtle and I haven't >>>> yet seen a case where it would get confused by this change, but >>>> there could easily be something like that lurking in the GC code. >>> >>> ...but then I thought that GC can in no way treat a Reference >>> differently whether it is still enqueued in a ReferenceQueue or >>> already dequeued (inactive) - responsibility is already on the Java >>> side. Currently the definition of Reference.next is this: >>> >>> /* When active: NULL >>> * pending: this >>> * Enqueued: next reference in queue (or this if last) >>> * Inactive: this >>> */ >>> @SuppressWarnings("rawtypes") >>> Reference next; >>> >>> We see that, unless GC inspects all ReferenceQueue instances and >>> scans their lists too, the state of a Reference that is enqueued as >>> last in list is indistinguishable from the state of inactive >>> Reference. So I deduced that this distinction (enqueued/inactive) >>> can't be established solely on the value of .next field ( == this or >>> != this)... >>> >>> But I should inspect the GC code too to build a better understanding >>> of that part of the story... >>> >>> Anyway. It turns out that there is already enough state in Reference >>> to destinguish between it being enqueued as last in list and already >>> dequeued (inactive) - the additional state is Reference.queue that >>> transitions from ReferenceQueue.ENQUEUED -> ReferenceQueue.NULL in >>> ReferenceQueue.reallyPoll. We need to just make sure the two fields >>> (r.next and r.queue) are assigned and read in correct order. This can >>> be achieved either by making Reference.next a volatile field or by a >>> couple of explicit fences: >>> >>> >>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.02/ >>> >>> The assignment of r.queue to ReferenceQueue.ENQUEUED already happens >>> before linking the reference into the queue's head in >>> ReferenceQueue.enqueue(): >>> >>> r.queue = ENQUEUED; >>> r.next = (head == null) ? r : head; >>> head = r; >>> >>> Both stores are volatile. >> >> This sounds a bit better. I don't have a good handle on the pros and >> cons of a volatile field over explicit fences via Unsafe in this case. >> Kim might jump in soon. >> >> I'd like to suggest an entirely less-clever solution. Since the >> problem is that forEach() might see inconsistent values for the queue >> and next fields of a Reference, but we don't want to grab a lock >> walking the whole queue, we could instead grab the lock as we look at >> each Reference (and do the "back-up" trick if we were interfered >> with). Of course this is slower than going lock-free, but it's not any >> more overhead than the ReferenceHandler thread or FinalizerThreads incur. >> >> The only benefit of this solution over the others is that it is less >> clever, and I'm not sure how much cleverness this problem deserves. >> Given that, and ranking the solutions as "lock" < (clever) "volatile" >> < "fence", I'd be happier with the "volatile" or "lock" solution if >> that is sufficient. >> >> - Derek >>>> >>>>> I also suggest the addition to the ReferenceQueue to be contained >>>>> (package-private) and as generic as possible. That's why I suggest >>>>> forEach iteration method with no concrete logic. >>>>> >>>>> It would be possible to encapsulate the entire logic into a special >>>>> package-private class (say java.lang.ref.DiagnosticCommands) with >>>>> static method(s) (printFinalizationQueue)... You could just expose >>>>> a package-private forEach static method from Finalizer and code the >>>>> rest in DiagnosticCommands. >>>> That's good for encapsulation. But I'm concerned that if "forEach" >>>> got exposed beyond careful use in DiagnosticCommands, and the >>>> Referents were leaked back into the heap, then we could get >>>> unexpected object resurrection after finalization. This isn't a bug >>>> on it's own - any finalizer might resurrect its object if not >>>> careful, but ordinarily /only/ the finalizer could resurrect the >>>> object. This could invalidate application invariants? >>> >>> Well, it all stays in the confines of package-private API - internal >>> to JDK. Any future additional use should be reviewed carefully. >>> Comments on the forEach() method should warn about that. >>> >>>> >>>> I agree that using a lock over the ReferenceQueue iteration opens up >>>> /another/ case of diagnostics affecting application behavior. And >>>> there are pathological scenarios where this gets severe. This is >>>> unfortunate but not uncommon. There is enough complication here that >>>> you should be sure that the fix for diagnostics performance doesn't >>>> introduce subtle bugs to the system in general. A change in this >>>> area should get a thorough review from both the runtime and GC sides. >>> >>> Is the webrev.02 proposed above more acceptable in that respect? It >>> does not introduce any logical changes to existing code. >>> >>>> >>>> Better yet, the reference handling code in GC and runtime should >>>> probably be thrown out and re-written, but that's another issue :-) >>> >>> I may have some proposals in that direction. Stay tuned. >>> >>> Regards, Peter >>> >>>> >>>> - Derek >>>> >>>>> Regards, Peter >>>>> >>>>> >>>>> On 05/12/2015 07:10 PM, Dmitry Samersoff wrote: >>>>>> Everybody, >>>>>> >>>>>> Updated version: >>>>>> >>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.03/ >>>>>> >>>>>> Now it iterates over queue and output result sorted by number of >>>>>> instances. >>>>>> >>>>>> -Dmitry >>>>>> >>>>>> On 2015-05-07 00:51, Derek White wrote: >>>>>>> Hi Dmitry, Staffan, >>>>>>> >>>>>>> Lots of good comments here. >>>>>>> >>>>>>> On the topic of what list should be printed out, I think we should focus >>>>>>> on objects waiting to be finalized - e.g. the contents of the >>>>>>> ReferenceQueue. It's more of a pain to walk the ReferenceQueue, but you >>>>>>> could add a summerizeQueue(TreeMap<String, Integer>) method, or a >>>>>>> general iterator and lambda. >>>>>>> >>>>>>> A histogram of objects with finalize methods might also be interesting, >>>>>>> but you can get most of the same information from a heap histogram >>>>>>> (ClassHistogramDCmd, and search for count of Finalizer instances). >>>>>>> >>>>>>> BTW, by only locking the ReferenceQueue, we should only be blocking the >>>>>>> FinalizerThread from making progress (and I think there's a GC thread >>>>>>> that runs after GC that sorts found References objects that need >>>>>>> processing into their respective ReferenceQueues). But locking the >>>>>>> "unfinalized" list blocks initializing any object with a finalize() >>>>>>> method. >>>>>>> >>>>>>> The sorting suggestion is a nice touch. >>>>>>> >>>>>>> - Derek White, GC team >>>>>>> >>>>>>> On 5/5/15 10:40 AM, Peter Levart wrote: >>>>>>>> Hi Dmitry, Staffan, >>>>>>>> >>>>>>>> On 05/05/2015 12:38 PM, Staffan Larsen wrote: >>>>>>>>> Dmitry, >>>>>>>>> >>>>>>>>> I think this should be reviewed on hotspot-gc and core-libs-dev as >>>>>>>>> well considering the changes to Finalizer. >>>>>>>>> >>>>>>>>> I’m a little worried about the potentially very large String that is >>>>>>>>> returned from printFinalizationQueue(). A possible different approach >>>>>>>>> would be to write printFinalizationQueue() in C++ inside Hotspot. >>>>>>>>> That would allow for outputting one line at a time. The output would >>>>>>>>> still be saved in memory (since the stream is buffered), but at least >>>>>>>>> the data is only saved once in memory, then. It would make the code a >>>>>>>>> bit harder to write, so its a question of how scared we are of >>>>>>>>> running out of memory. >>>>>>>> If the output is just a histogram of # of instances per class name, >>>>>>>> then it should not be too large, as there are not many different >>>>>>>> classes overriding finalize() method (I counted 72 overriddings of >>>>>>>> finalize() method in the whole jdk/src tree). >>>>>>>> >>>>>>>> I'm more concerned about the fact that while traversing the list, a >>>>>>>> lock is held, which might impact prompt finalization(). Is it >>>>>>>> acceptable for diagnostic output to impact performance and/or >>>>>>>> interfere with synchronization? >>>>>>>> >>>>>>>> It might be possible to scan the list without holding a lock for >>>>>>>> diagnostic purposes if Finalizer.remove() didn't set the removed >>>>>>>> Finalizer.next pointer to point back to itself: >>>>>>>> >>>>>>>> private void remove() { >>>>>>>> synchronized (lock) { >>>>>>>> if (unfinalized == this) { >>>>>>>> if (this.next != null) { >>>>>>>> unfinalized = this.next; >>>>>>>> } else { >>>>>>>> unfinalized = this.prev; >>>>>>>> } >>>>>>>> } >>>>>>>> if (this.next != null) { >>>>>>>> this.next.prev = this.prev; >>>>>>>> } >>>>>>>> if (this.prev != null) { >>>>>>>> this.prev.next = this.next; >>>>>>>> } >>>>>>>> // this.next = this; must not be set so that we can >>>>>>>> traverse the list unsynchronized >>>>>>>> this.prev = this; /* Indicates that this has been >>>>>>>> finalized */ >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> For detecting whether a Finalizer is already processed, the 'prev' >>>>>>>> pointer could be used instead: >>>>>>>> >>>>>>>> private boolean hasBeenFinalized() { >>>>>>>> return (prev == this); >>>>>>>> } >>>>>>>> >>>>>>>> Also, to make sure a data race dereferencing 'unfinalized' in >>>>>>>> unsynchronized printFinalizationQueue() would get you a fully >>>>>>>> initialized Finalizer instance (in particular the next pointer), you >>>>>>>> would have to make the 'unfinalized' field volatile or insert an >>>>>>>> Unsafe.storeFence() before assigning to 'unfinalized': >>>>>>>> >>>>>>>> private void add() { >>>>>>>> synchronized (lock) { >>>>>>>> if (unfinalized != null) { >>>>>>>> this.next = unfinalized; >>>>>>>> unfinalized.prev = this; >>>>>>>> } >>>>>>>> // make sure a data race dereferencing 'unfinalized' >>>>>>>> // in printFinalizationQueue() does see the Finalizer >>>>>>>> // instance fully initialized >>>>>>>> // (in particular the 'next' pointer) >>>>>>>> U.storeFence(); >>>>>>>> unfinalized = this; >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> >>>>>>>> By doing these modifications, I think you can remove >>>>>>>> synchronized(lock) {} from printFinalizationQueue(). >>>>>>>> >>>>>>>>> I see you are traversing the ‘unfinalized’ list in Finalizer, whereas >>>>>>>>> the old SA code for ‘-finalizerinfo' traverses the ‘queue’ list. I am >>>>>>>>> not sure of the difference, perhaps someone from the GC team can help >>>>>>>>> out? >>>>>>>> The 'queue' is a ReferenceQueue of Finalizer (FinalReference) >>>>>>>> instances pending processing by finalizer thread because their >>>>>>>> referents are eligible for finalization (they are not reachable any >>>>>>>> more). The 'unfinalized' is a doubly-linked list of all Finalizer >>>>>>>> instances for which their referents have not been finalized yet >>>>>>>> (including those that are still reachable and alive). The later serves >>>>>>>> two purposes: >>>>>>>> - it keeps Finalizer instances reachable until they are processed >>>>>>>> - it is a source of unfinalized instances for >>>>>>>> running-finalizers-on-exit if requested by >>>>>>>> System.runFinalizersOnExit(true); >>>>>>>> >>>>>>>> So it really depends on what one would like to see. Showing the queue >>>>>>>> may be interesting if one wants to see how the finalizer thread is >>>>>>>> coping with processing the finalize() invocations. Showing unfinalized >>>>>>>> list may be interesting if one wants to know how many live + >>>>>>>> finalization pending instances are there on the heap that override >>>>>>>> finalize() method. >>>>>>>> >>>>>>>> Regards, Peter >>>>>>>> >>>>>>>>> For the output, it would be a nice touch to sort it on the number of >>>>>>>>> references for each type. Perhaps outputting it more like a table >>>>>>>>> (see the old code again) would also make it easier to digest. >>>>>>>>> >>>>>>>>> In diagnosticCommand.hpp, the new commands should override >>>>>>>>> permission() and limit access: >>>>>>>>> >>>>>>>>> static const JavaPermission permission() { >>>>>>>>> JavaPermission p = {"java.lang.management.ManagementPermission", >>>>>>>>> "monitor", NULL}; >>>>>>>>> return p; >>>>>>>>> } >>>>>>>>> >>>>>>>>> The two tests don’t validate the output in any way. Would it be >>>>>>>>> possible to add validation? Perhaps hard to make sure an object is on >>>>>>>>> the finalizer queue… Hmm. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> /Staffan >>>>>>>>> >>>>>>>>> >>>>>>>>>> On 5 maj 2015, at 12:01, Dmitry Samersoff >>>>>>>>>> <dmitry.samers...@oracle.com> wrote: >>>>>>>>>> >>>>>>>>>> Everyone, >>>>>>>>>> >>>>>>>>>> Please review the fix: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.01/ >>>>>>>>>> >>>>>>>>>> heap dcmd outputs the same information as SIGBREAK >>>>>>>>>> >>>>>>>>>> finalizerinfo dcmd outputs a list of all classes in finalization >>>>>>>>>> queue >>>>>>>>>> with count >>>>>>>>>> >>>>>>>>>> -Dmitry >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Dmitry Samersoff >>>>>>>>>> Oracle Java development team, Saint Petersburg, Russia >>>>>>>>>> * I would love to change the world, but they won't give me the >>>>>>>>>> sources. >>>>> >>>> >>> >> > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources.