Hi Stuart
I ran an experiment with a small thread stack size, using -Xss228k or -Xss512k. Surprisingly, jtreg reports that the test passed, although I can see the StackOverflowError in the log even with the thread stack size set to 512k. So the problem is that the old shell script doesn't report the error even when there is a StackOverflowError.
Thank you.
Tristan

On 12/21/2013 08:01 AM, Stuart Marks wrote:
On 12/19/13 8:29 PM, David Holmes wrote:
If you were always one frame from the end then it is not so surprising that a simple change pushes you past the limit :) Try running the shell test with
additional recursive loads and see when it fails.

David doesn't seem surprised, but I guess I still am. :-)

Tristan, do you think you could do some investigation here, regarding the shell script based test's stack consumption? Run the shell-based test with some different values for -Xss and see how low you have to set it before it generates a stack overflow.
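A rough way to see how the reachable depth scales with stack size, without relaunching the JVM for each -Xss value, is to run a recursion in a thread created with an explicit stack size. This is only a sketch: StackDepthProbe and its methods are hypothetical names, the recursion uses trivial frames rather than the test's recursive class loading, and the stackSize argument to the Thread constructor is a hint that the JVM may round or ignore.

```java
public class StackDepthProbe {
    // Depth reached by the most recent probe; written only by the probe thread.
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until StackOverflowError in a thread with the requested stack size. */
    static int maxDepth(long stackSizeBytes) {
        depth = 0;
        Thread t = new Thread(null, () -> {
            try {
                recurse();
            } catch (StackOverflowError expected) {
                // The recursion is meant to overflow; depth now holds the count.
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() makes the probe thread's writes to depth visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return depth;
    }

    public static void main(String[] args) {
        // Sizes chosen to match the -Xss values discussed in this thread.
        for (long kb : new long[] {228, 512, 1024}) {
            System.out.println(kb + "k -> depth " + maxDepth(kb * 1024));
        }
    }
}
```

The absolute numbers mean little by themselves; the interesting part is how close the depth needed by the test sits to the limit at each size.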

It's also kind of strange that in the two stack traces I've seen (I
think I managed to capture only one in the bug report though) the
StackOverflowError occurs on loading exactly the 50th class. Since we're
observing intermittent behavior (happens sometimes but not others) the
stack size is apparently variable. Since it's variable I'd expect to see
it failing at different times, possibly the 49th or 48th recursive
classload, not just the 50th. And in such circumstances, do we know what
the default stack size is?

Classloading consumes a reasonable chunk of stack so if the variance elsewhere is quite small it is not that surprising that the test always fails on the 50th class. I would not expect run-to-run stack usage variance to be high unless
there is some random component to the test.

Hm. There should be no variance in stack usage coming from the test itself. I believe the test does the same thing every time.

The thing I'm concerned about is whether the Java-based test is doing something different from the shell-based test, because of the execution environment (jtreg or other). We may end up simply raising the stack limit anyway, but I still find it hard to believe that the shell-based test was consistently just a few frames shy of a stack overflow.

The failure is intermittent; we've seen it twice in JPRT (our internal build & test system). One possible source of the intermittency is the different machines on which the test executes, so environmental factors could be at play. How does the JVM determine the default stack size? Could different test runs on different machines be running with different stack sizes?
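One way to check what a given machine is actually using is to ask the running JVM for its ThreadStackSize flag. The sketch below assumes a HotSpot JVM (it relies on the com.sun.management API, which is not part of the Java SE specification), and DefaultStackSize is a hypothetical class name. HotSpot reports the value in KB, and 0 means the platform default is used.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class DefaultStackSize {
    private static VMOption stackSizeOption() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("ThreadStackSize");
    }

    /** Returns HotSpot's ThreadStackSize in KB; 0 means "use the platform default". */
    static long threadStackSizeKb() {
        return Long.parseLong(stackSizeOption().getValue());
    }

    public static void main(String[] args) {
        VMOption opt = stackSizeOption();
        // The origin tells whether the value is the built-in default or was set (e.g. via -Xss).
        System.out.println("ThreadStackSize = " + opt.getValue()
                + " KB, origin = " + opt.getOrigin());
    }
}
```

Logging this value at the start of the test on each JPRT machine would show whether the runs really differ in stack size.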

Another source of variance is the JIT. I believe JIT-compiled code consumes stack differently from interpreted code. At least, I've seen differences in stack usage between -Xint and -Xcomp runs, and in the absence of these options (which means -Xmixed, I guess) the results sometimes vary unpredictably. I guess this might have to do with when the JIT compiler decides to kick in.
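To get a feel for this effect, one could repeat the same overflow several times in a single JVM and compare the depths reached, then rerun with -Xint and -Xcomp. This is a sketch with hypothetical names (DepthVariance, probeOnce, sample); it uses trivial frames rather than class loading, so it only illustrates the kind of variance, not the test's actual behavior.

```java
public class DepthVariance {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses on the calling thread until the stack overflows; returns the depth reached. */
    static int probeOnce() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // Overflow is the point of the probe.
        }
        return depth;
    }

    /** Repeats the probe so that later runs may execute JIT-compiled code. */
    static int[] sample(int runs) {
        int[] depths = new int[runs];
        for (int i = 0; i < runs; i++) {
            depths[i] = probeOnce();
        }
        return depths;
    }

    public static void main(String[] args) {
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (int d : sample(20)) {
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        System.out.println("min depth = " + min + ", max depth = " + max);
    }
}
```

If min and max differ noticeably in the default mode but not under -Xint, that would point toward the JIT as a source of the run-to-run variance.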

This test does perform a bunch of iterations, so JIT compilation could be a factor.

I don't know if you were able to reproduce this issue. If you were, it
would be good to understand in more detail exactly what's going on.

FWIW there was a recent change in 7u to bump up the number of stack shadow pages in hotspot as "suddenly" StackOverflow tests were crashing instead of triggering StackOverflowError. So something started using more stack in a way that caused there to not be enough space to process a stack overflow properly. Finding the
exact cause can be somewhat tedious.

This seems like a different problem. We're seeing actual StackOverflowErrors, not crashes. Good to look out for this, though.

s'marks



Cheers,
David

s'marks
