Thanks Dan! I will incorporate your suggestions. Regards, Poonam
> -----Original Message----- > From: Daniel D. Daugherty > Sent: Friday, June 16, 2017 10:13 AM > To: Poonam Parhar; Mandy Chung > Cc: serviceability-dev; hotspot-runtime-...@openjdk.java.net > Subject: Re: RFR(10): JDK-8178536: OOM ERRORS + SERVICE-THREAD TAKES A > PROCESSOR TO 100% > > > Hotspot changes: > > > > http://cr.openjdk.java.net/~poonam/8178536/webrev.hotspot/ > > src/share/vm/services/lowMemoryDetector.cpp > nit - please update the copyright year before pushing. > > L307: // call Sensor::trigger(int, MemoryUsage) to send > notification to listeners. > nit - 'call' -> 'Call' > > L308: // when OOME occurs and fails to allocate MemoryUsage > object, call > nit - 'when' -> 'When' > > L328: // we just clear the OOM pending exception that we > might have encountered > nit 'we' -> 'We' > > L328, L330-32 - jcheck is going to complain about the trailing > blanks on > these lines. > > Thumbs up on the HotSpot changes, but these low memory situations are > fraught with unexpected failure modes and code paths. > > Dan > > > On 6/14/17 1:58 PM, Poonam Parhar wrote: > > Hello Mandy, > > > > > > > > Thanks for taking a look at the changes. My comments inlined below: > > > > > > > > From: Mandy Chung > > Sent: Wednesday, June 14, 2017 10:18 AM > > To: Poonam Parhar > > Cc: hotspot-runtime-...@openjdk.java.net; serviceability-dev > > Subject: Re: RFR(10): JDK-8178536: OOM ERRORS + SERVICE-THREAD TAKES > A > > PROCESSOR TO 100% > > > > > > > > > > > > On Jun 7, 2017, at 8:53 AM, Poonam Parhar <HYPERLINK > "mailto:poonam.ba...@oracle.com"poonam.ba...@oracle.com> wrote: > > > > > > > > Hello, > > > > Could I have reviews for the changes for the Bug HYPERLINK > "https://bugs.openjdk.java.net/browse/JDK-8178536"JDK-8178536: OOM > ERRORS + SERVICE-THREAD TAKES A PROCESSOR TO 100%. > > > > Problem: If there are listeners installed for MemoryMXBean then in > the event of an OOME, JVM internal thread 'Service Thread' responsible > for delivering notifications to the Java threads itself may encounter > an OOM exception and get into a loop of processing its pending > requests. This happens if and when the Service Thread executing the > native code calls its corresponding java methods and faces an OOM > exception, this pending exception makes the thread exit early from > SensorInfo∷trigger() function before it can update its pending requests > counter (_pending_trigger_count). This pending exception is never > cleared and that makes the thread loop in > LowMemoryDetector::process_sensor_changes(). > > > > Hotspot changes: > > > > http://cr.openjdk.java.net/~poonam/8178536/webrev.hotspot/ > > > > These changes check for the pending exception and clear it, and make > sure that the pending requests counters are kept in sync on both the > Java and the VM side. > > > > [Poonam] First of all apologies for this weird bold font. Not sure > why my email client sent two emails one with this bold font and one > without. > > > > This fix would drop any notification when it runs out of memory. > This approach is reasonable. Even we preallocate the notification > object, the listener code may fail due to OOM. > > > > > > > > There are two places doing the object allocation - VM allocates > MemoryUsage and the Sensor::trigger implementation. Another idea is to > change Sensor::trigger to take the values of init, used, committed, max > instead MemoryUsage so that the VM code can be simplified and the Java > code will catch all exception and return a boolean to indicate > success/fail. This will also prepare moving the invocation of the > user listener code in a Java thread instead of ServiceThread to address > the issue about ServiceThread is a hidden thread that can’t be > debugged. > > > > > > [Poonam] yes, that could also be another way. But for this bug, I > would like to keep the changes in sync with how the exceptions in the > Java code called from the VM side of the Service Thread are handled. > For example, ServiceThread::service_thread_entry() also calls > GCNotifier::sendNotification() and here’s how the pending exception is > handled. > > > > > > > > > > > > if(has_gc_notification_event) { > > > > GCNotifier::sendNotification(CHECK); > > > > } > > > > void GCNotifier::sendNotification(TRAPS) { > > > > GCNotifier::sendNotificationInternal(THREAD); > > > > // Clearing pending exception to avoid premature termination of > > > > // the service thread > > > > if (HAS_PENDING_EXCEPTION) { > > > > CLEAR_PENDING_EXCEPTION; > > > > } > > > > } > > > > > > > > When we get to the issue of enabling the debugging of the user code > currently being executed on the ServiceThread, we will have to be > looking at these other notifications as well and I would prefer that we > make those changes together under the same bug and not with this > change. > > > > > > > > Please let me know what you think. > > > > > > > > Thanks, > > > > Poonam > > > > > > > > JDK changes: > > > > http://cr.openjdk.java.net/~poonam/8178536/webrev.jdk/ > > > > These changes make the triggerAction() a no-op since we need to call > this method if MemoryService::create_MemoryUsage_obj() encounters an > OOM exception and we want to avoid further potential OOM exceptions in > triggerAction(MemoryUsage). > > > > > > > > Testcase: > > > > I have written a testcase though I won't be integrating it along with > the fix because the failure is unreliable; the Service Thread does not > always fall into the situation that causes it to loop i.e. it has > _pending_trigger_count>0 and hits an OOM in its Java calls. > > > > http://cr.openjdk.java.net/~poonam/8178536/webrev.hotspot.testcase/ > > > > > > > > OK. > > > > > > > > Mandy > > > > > > >