On Wed, 25 Oct 2023 21:08:01 GMT, Leonid Mesnik <lmes...@openjdk.org> wrote:
> The jtreg starts the main thread in a separate ThreadGroup and checks > unhandled exceptions for this group. However, it doesn't catch all unhandled > exceptions. There is a jtreg issue for this > https://bugs.openjdk.org/browse/CODETOOLS-7903526. > Catching such issues for virtual threads is important because they are not > included in any groups. So this fix implements the handler for the test > thread factory. > > A few tests start failing. > > The test > serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorVMEventsTest.java > has testcases for platform and virtual threads. So, there is there's no need > to run it with the thread factory. > > The test > java/lang/Thread/virtual/ThreadAPI.java > tests UncaughtExceptionHandler and virtual threads. No need to run it with a > thread factory. > > Test > test/jdk/java/util/concurrent/tck/ThreadTest.java is updated to not check the > default empty handler. > > Probably, we need some common approach about dealing with the > UncaughtExceptionHandler in jtreg. Hello Leonid, in order to understand what exactly we are trying to solve here, I ran a few tests to see how things work without the changes being proposed in this PR. Here's my findings. A bit of background first. When jtreg runs, either in agent mode or othervm mode, it creates a specific thread within which the actual test code runs. In either of these modes, it uses a custom jtreg specific `ThreadGroup` instance for this thread which is running the test code. This instance of `ThreadGroup` has specific overridden implementation of the `public void uncaughtException(Thread t, Throwable e)` API which keeps track of uncaught exception that might have been thrown by any threads that could have been spawned by the test. After `start()`ing the thread which runs the test code, the jtreg framework then waits for that thread to complete and once completed (either exceptionally or normally), jtreg framework then queries a state on the custom `ThreadGroup` instance to see if any uncaught exception(s) were received during the lifetime of this thread which ran that test. If it finds any, then it marks the test as failed and reports such a failure appropriately in the test repo rt. As noted, this applies for both the agent mode and other vm mode. Some important aspects of this implementation is that: - The custom `ThreadGroup` which has the overridden implementation of the `uncaughtException(Thread t, Throwable e)` method doesn't interfere with the JVM level default exception handler. - After the thread which ran the test code completes, the decision on whether to fail or pass a test is taken by checking the custom `ThreadGroup`'s state. Once this decision is done, the decision is immediately reported in relevant ways and the test status is marked (i.e. finalized) at this point. - If this was running in agent vm mode, the agent vm mode continues to operate and isn't terminated and thus is available for subsequent tests. This point I think is important to remember for reasons that will be noted later in this comment. Now coming to the part where in https://bugs.openjdk.org/browse/JDK-8303703 we introduced a way where jtreg instead of creating a platform thread (backed by its custom implementation of the `ThreadGroup`) to run the test code, now checks the presence of a `ThreadFactory` and if present, lets it create a `Thread` which runs this test code. In the JDK repo, we have an implementation of a `ThreadFactory` which creates a virtual thread instead of platform thread. Instances of virtual threads do belong to a `ThreadGroup`. But that instance of `ThreadGroup` cannot be controlled by user code and thus it cannot have a custom overridden implementation of `uncaughtException(Thread t, Throwable e)`. Any `Thread` created within the test code will have its uncaught exception handled by the virtual thread's `ThreadGroup` whose internal implementation just delegates to the "system" `ThreadGroup` which just writes it out to the `System.err` (of course, I am not considering `Thread`s which have been set with a specific uncaught excepiton handler). Effectively, what this means is when jtreg runs a test using a platform thread and if that test creates a `Thread` which throws an uncaught exception, then those tests will be marked and reported as failed. If however, jtreg runs a test using a virtual thread, then the same test will now be marked as passed. Clearly this is not a good thing. For reference, here's the test I used to verify this behaviour: import java.util.concurrent.atomic.AtomicBoolean; /* * @test * @run main FooTest */ public class FooTest { public static void main(final String[] args) throws Exception { final Thread otherThread = new Thread(new AlwaysThrows()); otherThread.setName("foo-bar"); System.out.println("Starting " + otherThread.getName() + " thread"); otherThread.start(); try { otherThread.join(); // wait for it } catch (InterruptedException ie) { System.out.println(Thread.currentThread().getName() + " was interrupted"); } // verify that the other thread was invoked if (!AlwaysThrows.invoked.get()) { throw new AssertionError(otherThread.getName() + " wasn't run"); } System.out.println("Test execution done"); } private static class AlwaysThrows implements Runnable { private static AtomicBoolean invoked = new AtomicBoolean(); @Override public void run() { invoked.set(true); System.out.println("Throwing an exception from " + Thread.currentThread().getName()); throw new RuntimeException("Intentionally thrown exception"); } } } Running this with jtreg in agentvm mode jtreg -report:files -verbose:summary -ea -esa -agentvm -jdk:build/macosx-aarch64/images/jdk test/jdk/java/lang/FooTest.java .... #section:main ----------messages:(7/204)---------- command: main FooTest reason: User specified action: run main FooTest started: Sun Oct 29 12:40:14 IST 2023 Mode: agentvm Agent id: 2 finished: Sun Oct 29 12:40:14 IST 2023 elapsed time (seconds): 0.191 ... ----------System.out:(4/109)---------- Starting foo-bar thread Throwing an exception from foo-bar AgentVMThread was interrupted Test execution done ----------System.err:(3/35)---------- JavaTest Message: Test complete. result: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Intentionally thrown exception test result: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Intentionally thrown exception Running in othervm mode: jtreg -report:files -verbose:summary -ea -esa -othervm -jdk:build/macosx-aarch64/images/jdk test/jdk/java/lang/FooTest.java ... #section:main ----------messages:(6/192)---------- command: main FooTest reason: User specified action: run main FooTest started: Sun Oct 29 12:41:22 IST 2023 Mode: othervm finished: Sun Oct 29 12:41:22 IST 2023 elapsed time (seconds): 0.073 ----------configuration:(0/0)---------- ----------System.out:(2/59)---------- Starting foo-bar thread Throwing an exception from foo-bar ----------System.err:(4/255)---------- java.lang.RuntimeException: Intentionally thrown exception at FooTest$AlwaysThrows.run(FooTest.java:32) at java.base/java.lang.Thread.run(Thread.java:1570) STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Intentionally thrown exception So in both agentvm and othervm mode it gets marked as failed. Now running the same using the virtual thread feature of jtreg: jtreg -report:files -verbose:summary -ea -esa -agentvm -jdk:build/macosx-aarch64/images/jdk -testThreadFactoryPath:.../build/macosx-aarch64/support/test/jtreg_test_thread_factory/jtregTestThreadFactory.jar -testThreadFactory:Virtual test/jdk/java/lang/FooTest.java ... #section:main ----------messages:(7/204)---------- command: main FooTest reason: User specified action: run main FooTest started: Sun Oct 29 19:12:33 IST 2023 Mode: agentvm Agent id: 1 finished: Sun Oct 29 19:12:34 IST 2023 elapsed time (seconds): 0.211 ----------System.out:(3/79)---------- Starting foo-bar thread Throwing an exception from foo-bar Test execution done ----------System.err:(6/223)---------- Exception in thread "foo-bar" java.lang.RuntimeException: Intentionally thrown exception at FooTest$AlwaysThrows.run(FooTest.java:32) at java.base/java.lang.Thread.run(Thread.java:1570) JavaTest Message: Test complete. result: Passed. Execution successful test result: Passed. Execution successful jtreg -report:files -verbose:summary -ea -esa -othervm -jdk:build/macosx-aarch64/images/jdk -testThreadFactoryPath:.../build/macosx-aarch64/support/test/jtreg_test_thread_factory/jtregTestThreadFactory.jar -testThreadFactory:Virtual test/jdk/java/lang/FooTest.java ... #section:main ----------messages:(6/191)---------- command: main FooTest reason: User specified action: run main FooTest started: Sun Oct 29 19:13:22 IST 2023 Mode: othervm finished: Sun Oct 29 19:13:22 IST 2023 elapsed time (seconds): 0.07 ----------configuration:(0/0)---------- ----------System.out:(3/79)---------- Starting foo-bar thread Throwing an exception from foo-bar Test execution done ----------System.err:(4/203)---------- Exception in thread "foo-bar" java.lang.RuntimeException: Intentionally thrown exception at FooTest$AlwaysThrows.run(FooTest.java:32) at java.base/java.lang.Thread.run(Thread.java:1570) STATUS:Passed. In both agentvm and othervm mode it is now marked as passed, which clearly is a change in behaviour and not a good thing. So coming to the proposed solution in this PR. It proposes to do a `System.exit(1)` on a uncaught exception being thrown from within `Thread` running in test code. I think doing a System.exit isn't a good thing, even if it is expected to be a temporary solution. I noted previously that when running in agentvm mode with platform thread, jtreg doesn't kill/exit the agent JVM if there's an uncaught exception from with the test code. That's one of the reasons why System.exit isn't a good idea here. The other reason is, even if we did go with this System.exit approach, I think that will abruptly bring down the test infrastructure without it having a way to properly report about this test status. I went ahead and applied this proposed patch (to `Virtual` class) locally and ran my test again and as expected it shows that that JVM was abruptly shutdown and jtreg now reports: Starting foo-bar thread Throwing an exception from foo-bar ----------System.err:(3/158)---------- java.lang.RuntimeException: Intentionally thrown exception at FooTest$AlwaysThrows.run(FooTest.java:32) at java.base/java.lang.Thread.run(Thread.java:1570) result: Error. Agent communication error: java.io.EOFException; check console log for any additional details test result: Error. Agent communication error: java.io.EOFException; check console log for any additional details I think to solve this correctly, this entire thing (perhaps even the virtual thread creation) needs to be done within the jtreg project where there's much more access to the internals of jtreg and there might be ways to detect these uncaught exceptions and report them cleanly as test failures. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16369#issuecomment-1784125246