On 17/01/2014 4:48 AM, srikalyan wrote:
Hi David
On 1/15/14, 9:04 PM, David Holmes wrote:
On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:
Hi Peter/David, we could finally get a trace of exception with fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs and disassembled code are available in JIRA
<https://bugs.openjdk.java.net/browse/JDK-8022321> as attachments.
All I can see is the log for the OOMECatchingTest program, not one for
the actual ReferenceHandler?
Please search for ReferenceHandler in the log.
Observations from the log:
Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141 Reference<Object> r;
142 synchronized (lock) {
143 if (pending != null) {
144 r = pending;
145 pending = r.discovered;
146 r.discovered = null;
The pending field in Reference is read and updated by the collector, so
at line 143, while the ReferenceHandler thread is executing, there may
already be an exception pending from an allocation done by the collector,
which causes the ReferenceHandler thread to die.
Sorry but the GC does not trigger asynchronous exceptions so this
explanation does not make any sense to me. What part of the log led
you to this conclusion?
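For readers without the source at hand: the excerpt from Reference.java
(lines 141-146) unlinks the head of a singly linked "pending" list while
holding a lock. Below is a minimal, self-contained sketch of that pattern.
The names PendingListSketch, Node, push() and poll() are hypothetical and
are not JDK code; the sketch also leaves out the collector's role in
building the list and the handler's wait on the lock when the list is empty.

```java
public class PendingListSketch {

    /** Hypothetical stand-in for java.lang.ref.Reference. */
    static final class Node {
        final String name;
        Node discovered;   // next element in the pending list

        Node(String name) {
            this.name = name;
        }
    }

    private final Object lock = new Object();
    private Node pending;  // head of the list

    /** Stand-in for whatever links new entries; in the JDK the collector does this. */
    public void push(String name) {
        synchronized (lock) {
            Node n = new Node(name);
            n.discovered = pending;
            pending = n;
        }
    }

    /** Mirrors the quoted lines: take the head, advance pending, clear the link. */
    public String poll() {
        Node r;
        synchronized (lock) {
            if (pending == null) {
                return null;
            }
            r = pending;
            pending = r.discovered;
            r.discovered = null;
        }
        return r.name;
    }

    public static void main(String[] args) {
        PendingListSketch list = new PendingListSketch();
        list.push("a");
        list.push("b");
        System.out.println(list.poll() + ", " + list.poll() + ", " + list.poll());
    }
}
```

Nothing inside the synchronized block of poll() allocates, which is the
property being discussed for the real code at line 143.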
------------------ Log Excerpt begins ------------------
Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 168]
for thread 0x00007feed80cf800
Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
thrown in interpreter method <{method} {0x00007feeddd3c600} 'runImpl'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'>
at bci 65 for thread 0x00007feed80cf800
Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
thrown in interpreter method <{method} {0x00007feeddd3c478} 'run'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'>
at bci 1 for thread 0x00007feed80cf800
Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 157]
for thread 0x00007feed80cf800
Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
thrown in interpreter method <{method} {0x00007feeddcaaf90}
'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '>
at bci 48 for thread 0x00007feed80cf800
Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
thrown in interpreter method <{method} {0x00007feeddca7298}
'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/>
at bci 6 for thread 0x00007feed80cf800
------------------ Log Excerpt ends ------------------
Sorry if it is a wrong understanding.
What you are seeing there is an OOME escaping the run() method, which
will cause the uncaughtExceptionHandler to be run, which then triggers a
second OOME (likely as it tries to report information about the first
OOME). The first exception occurred in runImpl at BCI 65. Can you
disassemble (javap -c) the class you used so we can see what is at BCI 65?
Thanks,
David
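The sequence described above (an OOME escapes run(), after which the
thread's uncaught exception handler is invoked on the dying thread) can be
illustrated with a small sketch. The class name UncaughtOomeSketch and the
method runAndCapture() are hypothetical, and the OOME is a preallocated,
simulated one rather than a real heap exhaustion, so this only shows the
dispatch path, not the second OOME seen in the log.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtOomeSketch {

    // Preallocated so that throwing it needs no further heap space.
    private static final OutOfMemoryError SIMULATED_OOME =
            new OutOfMemoryError("simulated");

    /** Starts a thread whose run() lets an OOME escape; returns what the handler saw. */
    public static Throwable runAndCapture() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = new Thread(() -> { throw SIMULATED_OOME; }, "handler-stand-in");
        // The handler runs on the dying thread, after run() has terminated abruptly.
        t.setUncaughtExceptionHandler((thread, ex) -> seen.set(ex));
        t.start();
        try {
            t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("handler saw: " + runAndCapture());
    }
}
```

In the real failure the heap is still full when the handler runs, so any
allocation the handler makes while reporting can itself fail.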
Suggested fix:
- As proposed earlier, putting an outer guard (try-catch on OOME) in the
ReferenceHandler will fix the issue. If ReferenceHandler is considered
part of the GC subsystem, then it should stay alive even in the midst
of an OOME, so I feel that the additional guard should be allowed;
however, I might still be unaware of vital implications.
- Apart from the above changes, Peter's suggestion to create and call a
private runImpl() from run() in ReferenceHandler makes sense to me.
Why would we need this?
David
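The two suggestions above (an outer try-catch on OOME, and run()
delegating to a private runImpl()) might look roughly like the sketch
below. This is not the JDK's ReferenceHandler: GuardedHandlerSketch,
Handler, the iteration bound and the simulated, preallocated OOME are all
assumptions made so the example is self-contained and terminates.

```java
public class GuardedHandlerSketch {

    /** Hypothetical stand-in for Reference.ReferenceHandler; not the JDK source. */
    public static final class Handler implements Runnable {
        // Preallocated so the simulated failure needs no allocation when thrown.
        private static final OutOfMemoryError SIMULATED_OOME =
                new OutOfMemoryError("simulated");

        private final int maxIterations;
        private int iterations;
        private int oomesSurvived;

        public Handler(int maxIterations) {
            this.maxIterations = maxIterations;
        }

        @Override
        public void run() {
            // The real handler loops forever; the bound only lets this sketch end.
            while (iterations < maxIterations) {
                try {
                    runImpl();
                } catch (OutOfMemoryError e) {
                    // Outer guard: swallow and retry. Nothing here allocates.
                    oomesSurvived++;
                }
            }
        }

        // One unit of work; every second call fails with the simulated OOME.
        private void runImpl() {
            iterations++;
            if (iterations % 2 == 0) {
                throw SIMULATED_OOME;
            }
        }

        public int iterations() {
            return iterations;
        }

        public int oomesSurvived() {
            return oomesSurvived;
        }
    }

    public static void main(String[] args) {
        Handler h = new Handler(10);
        Thread t = new Thread(h, "handler-stand-in");
        t.start();
        try {
            t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        System.out.println(h.oomesSurvived() + " simulated OOMEs survived in "
                + h.iterations() + " iterations");
    }
}
```

The point of the guard is only that the thread keeps looping after an
OOME; whether swallowing the error is acceptable in the real handler is
the open question in this thread.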
-----
---
Thanks
kalyan
On 01/13/2014 03:57 PM, srikalyan wrote:
On 1/11/14, 6:15 AM, Peter Levart wrote:
On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
Hi Peter, the version you provided ran indefinitely (I put a 10 minute
timeout) and the program got interrupted (no error),
Did you run it with or without fastdebug & -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes
the execution a bit so that we can no longer reproduce the wrong
behaviour.
With fastdebug & -XX:+TraceExceptions. I will try combinations of the
possible options (i.e. without -XX:+TraceExceptions on a debug build, etc.)
soon.
even if there were an error, you cannot print the "string" of the
thread to the console (this has been attempted earlier).
...it has been attempted to print toString in uncaught exception
handler. At that time, the heap is still full. I'm printing it after
the GC has cleared the heap. You can try that it works by commenting
out the "try {" and corresponding "} catch (OOME x) {}" exception
handler...
Since there is a GC call prior to printing the string, I will give that a
shot with a non-debug build.
- The test is running in interpreter mode; what I am watching for is
one error with a trace. Without the fastdebug build and
-XX:+TraceExceptions I am able to reproduce at least 5
failures out of 1000 runs, but with fastdebug+Trace no luck
yet (already past a few thousand runs).
It might be interesting to try with the fastdebug build but without the
-XX:+TraceExceptions option to see what has an effect on it. It might
also be interesting to try the modified ReferenceHandler (the one
with private runImpl() method called from run()) and with normal
non-fastdebug JDK. This info might be useful when one starts to
inspect the exception handling code in interpreter...
Regards, Peter
--
Thanks
kalyan
---
Thanks
kalyan
On 01/10/2014 02:57 AM, Peter Levart wrote:
On 01/10/2014 09:31 AM, Peter Levart wrote:
Since we suspect there's something wrong with exception handling
in the interpreter, I devised a hypothetical reproducer that tries to
simulate ReferenceHandler in many aspects, but doesn't need to
be a ReferenceHandler:
http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java
This is designed to run indefinitely and only terminate if/when
thread dies. Could you run this program in the environment that
causes the OOMEInReferenceHandler test to fail and see if it
terminates?
I forgot to mention that in order for this long-running program to
exhibit interpreter behaviour, it should be run with -Xint option.
So I suggest:
-Xmx24M -XX:-UseTLAB -Xint
Regards, Peter