> Hotspot changes:
>
> http://cr.openjdk.java.net/~poonam/8178536/webrev.hotspot/
src/share/vm/services/lowMemoryDetector.cpp
nit - please update the copyright year before pushing.
L307: // call Sensor::trigger(int, MemoryUsage) to send
notification to listeners.
nit - 'call' -> 'Call'
L308: // when OOME occurs and fails to allocate MemoryUsage
object, call
nit - 'when' -> 'When'
L328: // we just clear the OOM pending exception that we
might have encountered
nit 'we' -> 'We'
L328, L330-32 - jcheck is going to complain about the trailing
blanks on
these lines.
Thumbs up on the HotSpot changes, but these low memory situations
are fraught with unexpected failure modes and code paths.
Dan
On 6/14/17 1:58 PM, Poonam Parhar wrote:
Hello Mandy,
Thanks for taking a look at the changes. My comments inlined below:
From: Mandy Chung
Sent: Wednesday, June 14, 2017 10:18 AM
To: Poonam Parhar
Cc: hotspot-runtime-...@openjdk.java.net; serviceability-dev
Subject: Re: RFR(10): JDK-8178536: OOM ERRORS + SERVICE-THREAD TAKES A
PROCESSOR TO 100%
On Jun 7, 2017, at 8:53 AM, Poonam Parhar <HYPERLINK
"mailto:poonam.ba...@oracle.com"poonam.ba...@oracle.com> wrote:
Hello,
Could I have reviews for the changes for the Bug HYPERLINK
"https://bugs.openjdk.java.net/browse/JDK-8178536"JDK-8178536: OOM ERRORS +
SERVICE-THREAD TAKES A PROCESSOR TO 100%.
Problem: If there are listeners installed for MemoryMXBean then in the event of
an OOME, JVM internal thread 'Service Thread' responsible for delivering
notifications to the Java threads itself may encounter an OOM exception and get
into a loop of processing its pending requests. This happens if and when the
Service Thread executing the native code calls its corresponding java methods
and faces an OOM exception, this pending exception makes the thread exit early
from SensorInfo∷trigger() function before it can update its pending requests
counter (_pending_trigger_count). This pending exception is never cleared and
that makes the thread loop in LowMemoryDetector::process_sensor_changes().
Hotspot changes:
http://cr.openjdk.java.net/~poonam/8178536/webrev.hotspot/
These changes check for the pending exception and clear it, and make sure that
the pending requests counters are kept in sync on both the Java and the VM side.
[Poonam] First of all apologies for this weird bold font. Not sure why my email
client sent two emails one with this bold font and one without.
This fix would drop any notification when it runs out of memory. This approach
is reasonable. Even we preallocate the notification object, the listener code
may fail due to OOM.
There are two places doing the object allocation - VM allocates MemoryUsage and
the Sensor::trigger implementation. Another idea is to change Sensor::trigger
to take the values of init, used, committed, max instead MemoryUsage so that
the VM code can be simplified and the Java code will catch all exception and
return a boolean to indicate success/fail. This will also prepare moving the
invocation of the user listener code in a Java thread instead of ServiceThread
to address the issue about ServiceThread is a hidden thread that can’t be
debugged.
[Poonam] yes, that could also be another way. But for this bug, I would like to
keep the changes in sync with how the exceptions in the Java code called from
the VM side of the Service Thread are handled. For example,
ServiceThread::service_thread_entry() also calls GCNotifier::sendNotification()
and here’s how the pending exception is handled.
if(has_gc_notification_event) {
GCNotifier::sendNotification(CHECK);
}
void GCNotifier::sendNotification(TRAPS) {
GCNotifier::sendNotificationInternal(THREAD);
// Clearing pending exception to avoid premature termination of
// the service thread
if (HAS_PENDING_EXCEPTION) {
CLEAR_PENDING_EXCEPTION;
}
}
When we get to the issue of enabling the debugging of the user code currently
being executed on the ServiceThread, we will have to be looking at these other
notifications as well and I would prefer that we make those changes together
under the same bug and not with this change.
Please let me know what you think.
Thanks,
Poonam
JDK changes:
http://cr.openjdk.java.net/~poonam/8178536/webrev.jdk/
These changes make the triggerAction() a no-op since we need to call this
method if MemoryService::create_MemoryUsage_obj() encounters an OOM exception
and we want to avoid further potential OOM exceptions in
triggerAction(MemoryUsage).
Testcase:
I have written a testcase though I won't be integrating it along with the fix
because the failure is unreliable; the Service Thread does not always fall into
the situation that causes it to loop i.e. it has _pending_trigger_count>0 and
hits an OOM in its Java calls.
http://cr.openjdk.java.net/~poonam/8178536/webrev.hotspot.testcase/
OK.
Mandy