Re: RFR for JDK-8030284 TEST_BUG: intermittent StackOverflow in RMI bench/serial test

Tristan Yan Mon, 23 Dec 2013 22:00:14 -0800

Hi Stuart

I ran an experiment with a small thread stack size, using -Xss228k or -Xss512k. Surprisingly, jtreg reports that the test passed, although I can see the StackOverflowError in the log even with the thread stack size set to 512k. So the problem is that the old shell script doesn't report the error even when there is a StackOverflowError.
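A minimal sketch of the kind of check the shell script appears to be missing, assuming the test output has been captured to a file. The class name LogCheck and its method are hypothetical and are not part of the existing test or of jtreg:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: fail the run when the captured log mentions a StackOverflowError.
public class LogCheck {

    // Returns true if the log text contains the error's class name.
    static boolean reportsStackOverflow(String logText) {
        return logText.contains("java.lang.StackOverflowError");
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of the captured test log.
        String log = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        if (reportsStackOverflow(log)) {
            System.err.println("StackOverflowError found in log; treating run as failed");
            System.exit(1); // non-zero exit so a wrapping script can report failure
        }
    }
}
```

A wrapping script would then need to propagate the non-zero exit status for the harness to see the failure.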

Thank you.
Tristan


On 12/21/2013 08:01 AM, Stuart Marks wrote:

On 12/19/13 8:29 PM, David Holmes wrote:
If you were always one frame from the end then it is not so surprising that a simple change pushes you past the limit :) Try running the shell test with
additional recursive loads and see when it fails.
David doesn't seem surprised, but I guess I still am. :-)
Tristan, do you think you could do some investigation here, regarding the shell script based test's stack consumption? Run the shell-based test with some different values for -Xss and see how low you have to set it before it generates a stack overflow.
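As a rough illustration of this kind of investigation (not the RMI benchmark itself), the sketch below counts how many nested calls fit in a thread created with a requested stack size. The class StackDepthProbe is hypothetical. It relies on the Thread constructor that takes a stackSize argument, which the JVM treats only as a hint and may round up or ignore on some platforms:

```java
// Hypothetical probe: how deep can a trivial recursion go for a requested stack size?
public class StackDepthProbe {
    private int depth;

    private void recurse() {
        depth++;
        recurse();
    }

    // Runs the recursion in a fresh thread and returns the depth reached
    // before StackOverflowError was thrown.
    static int maxDepth(long stackSizeBytes) {
        StackDepthProbe probe = new StackDepthProbe();
        Thread t = new Thread(null, () -> {
            try {
                probe.recurse();
            } catch (StackOverflowError expected) {
                // probe.depth now holds the number of frames that fit
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join makes the final value of probe.depth visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        // Sizes chosen to mirror the -Xss values mentioned in this thread.
        for (long kb : new long[] {228, 512, 1024}) {
            System.out.println(kb + "k -> depth " + maxDepth(kb * 1024));
        }
    }
}
```

The frames here are much smaller than those used during class loading, so the absolute numbers say nothing about the 50-class limit; only the trend across sizes is of interest.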
It's also kind of strange that in the two stack traces I've seen (I
think I managed to capture only one in the bug report though) the
StackOverflowError occurs on loading exactly the 50th class. Since we're
observing intermittent behavior (happens sometimes but not others) the
stack size is apparently variable. Since it's variable I'd expect to see
it failing at different times, possibly the 49th or 48th recursive
classload, not just the 50th. And in such circumstances, do we know what
the default stack size is?
Classloading consumes a reasonable chunk of stack so if the variance elsewhere is quite small it is not that surprising that the test always fails on the 50th class. I would not expect run-to-run stack usage variance to be high unless
there is some random component to the test.
Hm. There should be no variance in stack usage coming from the test itself. I believe the test does the same thing every time.
The thing I'm concerned about is whether the Java-based test is doing something different from the shell-based test, because of the execution environment (jtreg or other). We may end up simply raising the stack limit anyway, but I still find it hard to believe that the shell-based test was consistently just a few frames shy of a stack overflow.
The failure is intermittent; we've seen it twice in JPRT (our internal build&test system). One possible source of the intermittency is the different machines on which the test executes, so environmental factors could be at play. How does the JVM determine the default stack size? Could different test runs on different machines be running with different stack sizes?
Another source of variance is the JIT. I believe JIT-compiled code consumes stack differently from interpreted code. At least, I've seen differences in stack usage between -Xint and -Xcomp runs, and in the absence of these options (which means -Xmixed, I guess) the results sometimes vary unpredictably. I guess this might have to do with when the JIT compiler decides to kick in.
This test does perform a bunch of iterations, so JIT compilation could be a factor.
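One way to look for this effect, sketched under the assumption that a trivial recursion is a fair stand-in: repeat the same depth measurement many times in one JVM and compare the minimum and maximum. The class DepthVariance is hypothetical; running it with -Xint, with -Xcomp, and with no option would show whether the spread depends on the compilation mode:

```java
// Hypothetical experiment: does the reachable recursion depth change across iterations?
public class DepthVariance {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until StackOverflowError and returns the depth reached.
    static int measureOnce() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // depth holds the number of calls that fit on this thread's stack
        }
        return depth;
    }

    // Returns {min, max} of the depth over the given number of runs (runs >= 1).
    static int[] minMax(int runs) {
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (int i = 0; i < runs; i++) {
            int d = measureOnce();
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        int[] r = minMax(200);
        System.out.println("min depth " + r[0] + ", max depth " + r[1]);
    }
}
```

A wide gap between the two numbers in mixed mode, and a narrow one under -Xint, would be consistent with the JIT changing frame sizes partway through the run.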
I don't know if you were able to reproduce this issue. If you were, it
would be good to understand in more detail exactly what's going on.
FWIW there was a recent change in 7u to bump up the number of stack shadow pages in hotspot as "suddenly" StackOverflow tests were crashing instead of triggering StackOverflowError. So something started using more stack in a way that caused there to not be enough space to process a stack overflow properly. Finding the
exact cause can be somewhat tedious.
This seems like a different problem. We're seeing actual StackOverflowErrors, not crashes. Good to look out for this, though.
s'marks
Cheers,
David
s'marks
