On Wed, 25 Oct 2023 21:08:01 GMT, Leonid Mesnik <lmes...@openjdk.org> wrote:

> The jtreg starts the main thread in a separate ThreadGroup and checks 
> unhandled exceptions for this group. However, it doesn't catch all unhandled 
> exceptions. There is a jtreg issue for this 
> https://bugs.openjdk.org/browse/CODETOOLS-7903526.
> Catching such issues for virtual threads is important because they are not 
> included in any groups. So this fix implements the handler for the test 
> thread factory. 
> 
> A few tests start failing.
> 
> The test
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorVMEventsTest.java
> has testcases for platform and virtual threads. So, there is there's no need 
> to run it with the thread factory.
> 
> The test
> java/lang/Thread/virtual/ThreadAPI.java
> tests UncaughtExceptionHandler and virtual threads. No need to run it with a 
> thread factory.
> 
> Test 
> test/jdk/java/util/concurrent/tck/ThreadTest.java is updated to not check the 
> default empty handler.
> 
> Probably, we need some common approach about dealing with the 
> UncaughtExceptionHandler in jtreg.

Hello Leonid, in order to understand what exactly we are trying to solve here, 
I ran a few tests to see how things work without the changes being proposed in 
this PR. Here's my findings.

A bit of background first. When jtreg runs, either in agent mode or othervm 
mode, it creates a specific thread within which the actual test code runs. In 
either of these modes, it uses a custom jtreg specific `ThreadGroup` instance 
for this thread which is running the test code. This instance of `ThreadGroup` 
has specific overridden implementation of the `public void 
uncaughtException(Thread t, Throwable e)` API which keeps track of uncaught 
exception that might have been thrown by any threads that could have been 
spawned by the test. After `start()`ing the thread which runs the test code, 
the jtreg framework then waits for that thread to complete and once completed 
(either exceptionally or normally), jtreg framework then queries a state on the 
custom `ThreadGroup` instance to see if any uncaught exception(s) were received 
during the lifetime of this thread which ran that test. If it finds any, then 
it marks the test as failed and reports such a failure appropriately in the 
test repo
 rt. As noted, this applies for both the agent mode and other vm mode. Some 
important aspects of this implementation is that:

- The custom `ThreadGroup` which has the overridden implementation of the 
`uncaughtException(Thread t, Throwable e)` method doesn't interfere with the 
JVM level default exception handler.

- After the thread which ran the test code completes, the decision on whether 
to fail or pass a test is taken by checking the custom `ThreadGroup`'s state. 
Once this decision is done, the decision is immediately reported in relevant 
ways and the test status is marked (i.e. finalized) at this point.

- If this was running in agent vm mode, the agent vm mode continues to operate 
and isn't terminated and thus is available for subsequent tests. This point I 
think is important to remember for reasons that will be noted later in this 
comment.

Now coming to the part where in https://bugs.openjdk.org/browse/JDK-8303703 we 
introduced a way where jtreg instead of creating a platform thread (backed by 
its custom implementation of the `ThreadGroup`) to run the test code, now 
checks the presence of a `ThreadFactory` and if present, lets it create a 
`Thread` which runs this test code. In the JDK repo, we have an implementation 
of a `ThreadFactory` which creates a virtual thread instead of platform thread. 
Instances of virtual threads do belong to a `ThreadGroup`. But that instance of 
`ThreadGroup` cannot be controlled by user code and thus it cannot have a 
custom overridden implementation of `uncaughtException(Thread t, Throwable e)`. 
Any `Thread` created within the test code will have its uncaught exception 
handled by the virtual thread's `ThreadGroup` whose internal implementation 
just delegates to the "system" `ThreadGroup` which just writes it out to the 
`System.err` (of course, I am not considering `Thread`s which have been 
 set with a specific uncaught excepiton handler).

Effectively, what this means is when jtreg runs a test using a platform thread 
and if that test creates a `Thread` which throws an uncaught exception, then 
those tests will be marked and reported as failed. If however, jtreg runs a 
test using a virtual thread, then the same test will now be marked as passed. 
Clearly this is not a good thing.

For reference, here's the test I used to verify this behaviour:


import java.util.concurrent.atomic.AtomicBoolean;

/*
 * @test
 * @run main FooTest
 */
public class FooTest {
    public static void main(final String[] args) throws Exception {
        final Thread otherThread = new Thread(new AlwaysThrows());
        otherThread.setName("foo-bar");
        System.out.println("Starting " + otherThread.getName() + " thread");
        otherThread.start();
        try {
            otherThread.join(); // wait for it
        } catch (InterruptedException ie) {
            System.out.println(Thread.currentThread().getName() + " was 
interrupted");
        }
        // verify that the other thread was invoked
        if (!AlwaysThrows.invoked.get()) {
            throw new AssertionError(otherThread.getName() + " wasn't run");
        }
        System.out.println("Test execution done");
    }

    private static class AlwaysThrows implements Runnable {
        private static AtomicBoolean invoked = new AtomicBoolean();

        @Override
        public void run() {
            invoked.set(true);
            System.out.println("Throwing an exception from " + 
Thread.currentThread().getName());
            throw new RuntimeException("Intentionally thrown exception");
        }
    }

}

Running this with jtreg in agentvm mode

jtreg -report:files -verbose:summary -ea -esa -agentvm 
-jdk:build/macosx-aarch64/images/jdk test/jdk/java/lang/FooTest.java
....
#section:main
----------messages:(7/204)----------
command: main FooTest
reason: User specified action: run main FooTest 
started: Sun Oct 29 12:40:14 IST 2023
Mode: agentvm
Agent id: 2
finished: Sun Oct 29 12:40:14 IST 2023
elapsed time (seconds): 0.191
...

----------System.out:(4/109)----------
Starting foo-bar thread
Throwing an exception from foo-bar
AgentVMThread was interrupted
Test execution done
----------System.err:(3/35)----------

JavaTest Message: Test complete.

result: Failed. Execution failed: `main' threw exception: 
java.lang.RuntimeException: Intentionally thrown exception


test result: Failed. Execution failed: `main' threw exception: 
java.lang.RuntimeException: Intentionally thrown exception


Running in othervm mode:


jtreg -report:files -verbose:summary -ea -esa -othervm 
-jdk:build/macosx-aarch64/images/jdk test/jdk/java/lang/FooTest.java
...
#section:main
----------messages:(6/192)----------
command: main FooTest
reason: User specified action: run main FooTest 
started: Sun Oct 29 12:41:22 IST 2023
Mode: othervm
finished: Sun Oct 29 12:41:22 IST 2023
elapsed time (seconds): 0.073
----------configuration:(0/0)----------
----------System.out:(2/59)----------
Starting foo-bar thread
Throwing an exception from foo-bar
----------System.err:(4/255)----------
java.lang.RuntimeException: Intentionally thrown exception
        at FooTest$AlwaysThrows.run(FooTest.java:32)
        at java.base/java.lang.Thread.run(Thread.java:1570)
STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Intentionally 
thrown exception


So in both agentvm and othervm mode it gets marked as failed. Now running the 
same using the virtual thread feature of jtreg:


jtreg -report:files -verbose:summary -ea -esa -agentvm 
-jdk:build/macosx-aarch64/images/jdk  
-testThreadFactoryPath:.../build/macosx-aarch64/support/test/jtreg_test_thread_factory/jtregTestThreadFactory.jar
 -testThreadFactory:Virtual test/jdk/java/lang/FooTest.java
...
#section:main
----------messages:(7/204)----------
command: main FooTest
reason: User specified action: run main FooTest 
started: Sun Oct 29 19:12:33 IST 2023
Mode: agentvm
Agent id: 1
finished: Sun Oct 29 19:12:34 IST 2023
elapsed time (seconds): 0.211
----------System.out:(3/79)----------
Starting foo-bar thread
Throwing an exception from foo-bar
Test execution done
----------System.err:(6/223)----------
Exception in thread "foo-bar" java.lang.RuntimeException: Intentionally thrown 
exception
        at FooTest$AlwaysThrows.run(FooTest.java:32)
        at java.base/java.lang.Thread.run(Thread.java:1570)

JavaTest Message: Test complete.
result: Passed. Execution successful
test result: Passed. Execution successful


jtreg -report:files -verbose:summary -ea -esa -othervm 
-jdk:build/macosx-aarch64/images/jdk  
-testThreadFactoryPath:.../build/macosx-aarch64/support/test/jtreg_test_thread_factory/jtregTestThreadFactory.jar
 -testThreadFactory:Virtual test/jdk/java/lang/FooTest.java
...
#section:main
----------messages:(6/191)----------
command: main FooTest
reason: User specified action: run main FooTest 
started: Sun Oct 29 19:13:22 IST 2023
Mode: othervm
finished: Sun Oct 29 19:13:22 IST 2023
elapsed time (seconds): 0.07
----------configuration:(0/0)----------
----------System.out:(3/79)----------
Starting foo-bar thread
Throwing an exception from foo-bar
Test execution done
----------System.err:(4/203)----------
Exception in thread "foo-bar" java.lang.RuntimeException: Intentionally thrown 
exception
  at FooTest$AlwaysThrows.run(FooTest.java:32)
  at java.base/java.lang.Thread.run(Thread.java:1570)
STATUS:Passed.

In both agentvm and othervm mode it is now marked as passed, which clearly is a 
change in behaviour and not a good thing.


So coming to the proposed solution in this PR. It proposes to do a 
`System.exit(1)` on a uncaught exception being thrown from within `Thread` 
running in test code. I think doing a System.exit isn't a good thing, even if 
it is expected to be a temporary solution. I noted previously that when running 
in agentvm mode with platform thread, jtreg doesn't kill/exit the agent JVM if 
there's an uncaught exception from with the test code. That's one of the 
reasons why System.exit isn't a good idea here. The other reason is, even if we 
did go with this System.exit approach, I think that will abruptly bring down 
the test infrastructure without it having a way to properly report about this 
test status. I went ahead and applied this proposed patch (to `Virtual` class) 
locally and ran my test again and as expected it shows that that JVM was 
abruptly shutdown and jtreg now reports:


Starting foo-bar thread
Throwing an exception from foo-bar
----------System.err:(3/158)----------
java.lang.RuntimeException: Intentionally thrown exception
    at FooTest$AlwaysThrows.run(FooTest.java:32)
    at java.base/java.lang.Thread.run(Thread.java:1570)
result: Error. Agent communication error: java.io.EOFException; check console 
log for any additional details


test result: Error. Agent communication error: java.io.EOFException; check 
console log for any additional details



I think to solve this correctly, this entire thing (perhaps even the virtual 
thread creation) needs to be done within the jtreg project where there's much 
more access to the internals of jtreg and there might be ways to detect these 
uncaught exceptions and report them cleanly as test failures.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16369#issuecomment-1784125246

Reply via email to