On 22 September 2011 13:04, Christopher Dolan <[email protected]> wrote:
> It's quite reproducible, but only in integration not in isolation (yet). It
> was discovered as a hang in QA, then after several reproductions, someone
> attached a debugger and found that surprising exception.
>
> We've discovered that switching from Sun Java 1.6 back to 1.5 makes it go
> away.
>
> One noteworthy fact I've since learned (and it's obvious from the stacktrace
> in retrospect): this is a C++ thread and the root of the Java stack comes
> from C++ via JNI. In theory that shouldn't matter, right? But that's sure has
> a suspicious smell to me.

Maybe, but all the code you're executing is Java code and all the variables it references are Java variables. For JNI to mess that up would take memory corruption or something similar, and I can't help thinking that if that were going on you'd have a lot of other odd behaviour creeping out of the woodwork.

The lock is notionally being acquired and the wait is being called on the same reference. If the reference were corrupt, I reckon the synchronized block would likely fail when acquiring the lock. That doesn't seem to happen, which implies that the reference is changing between the synchronized and the wait. That in turn implies that GC could be a culprit, or that the JIT is re-ordering instructions wrongly or has in some way conspired to mix up references.
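
To make the "reference changing between the synchronized and the wait" idea concrete, here is a minimal sketch of that failure mode in plain Java. It assumes the surprising exception is an IllegalMonitorStateException, which the thread doesn't actually say, and it swaps the reference on purpose rather than through any GC or JIT fault. The class and method names are made up for illustration and have nothing to do with your code.

```java
public class MonitorMismatchSketch {
    // Deliberately non-final so the reference can be changed inside the
    // synchronized block (hypothetical field, for illustration only).
    private static Object lock = new Object();

    /** Locks one object, then waits on a different one. Returns true if wait() threw. */
    public static boolean waitOnSwappedReference() {
        Object held = lock;
        synchronized (held) {
            // Simulate the reference changing after the monitor was acquired.
            lock = new Object();
            try {
                lock.wait(10); // this thread does not own the new object's monitor
                return false;
            } catch (IllegalMonitorStateException e) {
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Locks and waits on the same object. Returns true if wait() threw. */
    public static boolean waitOnSameReference() {
        Object held = lock;
        synchronized (held) {
            try {
                held.wait(10); // owner of the monitor, so this just times out
                return false;
            } catch (IllegalMonitorStateException e) {
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("same reference threw: " + waitOnSameReference());
        System.out.println("swapped reference threw: " + waitOnSwappedReference());
    }
}
```

If your code really does lock and wait on the same field with nothing reassigning it, you should only ever see the first behaviour. Seeing the second would point at the reference not being what the synchronized block locked.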
