Hi Chris,
On 10/07/2018 4:22 AM, Chris Plummer wrote:
Hi David,
Would it be better to problem list this test on solaris using
JDK-8156708? That way when JDK-8156708 is fixed it can come off the
problem list and start executing on solaris.
JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could only
fix this for VM created threads. The general problem of TLS destructors
looping if a thread terminates without detaching from the VM is not
solvable - other than by not using TLS in the VM.
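To illustrate the looping, here is a minimal C sketch, not the VM's code. It assumes the destructor keeps the thread's TLS value alive by setting it again, which asks the pthread library for another round of destructor calls. POSIX allows an implementation either to stop after PTHREAD_DESTRUCTOR_ITERATIONS rounds or to keep going until no values remain, and the second behaviour is the infinite-loop case. MAX_REARMS, rearming_destructor and count_destructor_calls are hypothetical names, and the cap exists only so the sketch terminates on any platform.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_REARMS 2   /* hypothetical cap so the sketch always terminates */

static pthread_key_t key;
static int destructor_calls = 0;

/* Called at thread exit while the key's value is non-NULL. The library
 * clears the value before each call, so setting it again requests
 * another destructor round. */
static void rearming_destructor(void *value) {
    destructor_calls++;
    if (destructor_calls <= MAX_REARMS) {
        pthread_setspecific(key, value);
    }
}

/* Sets the TLS value and terminates without clearing it, standing in
 * for a thread that exits without detaching. */
static void *thread_body(void *arg) {
    pthread_setspecific(key, arg);
    return NULL;
}

/* Runs one such thread and reports how often the destructor ran. */
int count_destructor_calls(void) {
    static int marker;
    pthread_t t;
    int rc;

    destructor_calls = 0;
    rc = pthread_key_create(&key, rearming_destructor);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, thread_body, &marker);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void) rc;
    pthread_key_delete(key);
    return destructor_calls;
}
```

With the cap at 2, count_destructor_calls() returns 3: two re-armed rounds plus the final one. Without the cap, the outcome depends on whether the platform bounds the number of rounds.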
Thanks,
David
thanks,
Chris
On 7/8/18 4:58 PM, David Holmes wrote:
tl;dr skip the new regression test on Solaris
New webrev:
http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
This excludes the test from running on Solaris, so the makefile
doesn't bother compiling this native test and the Java part of the
test adds:
! * @requires os.family != "windows" & os.family != "solaris"
  * @summary Basic test of Thread and ThreadMXBean queries on a natively
  *          attached thread that has failed to detach before terminating.
+ * @comment The native code only supports POSIX so no windows testing; also
+ *          we have to skip solaris as a terminating thread that fails to
+ *          detach will hit an infinite loop due to TLS destructor issues - see
+ *          comments in JDK-8156708
Note this means that Solaris is not affected by the original issue
because a still-attached native thread can't actually terminate due to
the TLS destructor infinite-loop issue.
Thanks,
David
On 6/07/2018 6:07 PM, David Holmes wrote:
<sigh> The new test is hanging on Solaris. I just discovered we don't
run these tests on Solaris until tier4.
David
On 6/07/2018 8:40 AM, David Holmes wrote:
Hi Chris,
Thanks for looking at this.
Updated webrev:
http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
Only real changes in ji05t001.c. (And fixed typo in the new test)
More below ...
On 6/07/2018 7:55 AM, Chris Plummer wrote:
Hi David,
Solaris problems aside, overall it looks fine. Some minor things I
noted:
I noticed that exitCode is never modified in agentA() or agentB(),
so there isn't much point to having it. If you reach the bottom of
the function, it passed, so PASSED can be returned. The code would
be more clear if it did this. As-is it is implied that you can
reach the bottom when it fails.
I resisted any and all urges to do any kind of unrelated code
cleanup in the tests - once you start you may end up doing a full
rewrite.
Is detaching the threads along the failure paths really needed?
exit() is called, so this would seem to make it unnecessary.
You're right that isn't necessary. I'll remove the changes from
before the exits in ji05t001.c
I prefer assignments not to be embedded inside the "if" condition.
The DetachCurrentThread code in THREAD_return() is much more
readable than the similar code in agentA() and agentB().
It's an existing style already used in that test e.g.
    if ((res =
            JNI_ENV_PTR(vm)->AttachCurrentThread(
                JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
and I don't mind it, so I'd prefer not to change it.
In the test:
    // Generally as long as we don't crash of throw unexpected
    // exceptions then the test passes. In some cases we know exactly
"of" should be "or".
Well spotted. Thanks.
Shouldn't you be catching exceptions for all the Thread methods you
are calling? Otherwise the test will exit if one is thrown, and the
above comment indicates that you don't want this.
I'm not expecting there to be any exceptions from any of the called
methods. That would potentially indicate a problem in handling the
terminated native thread, so would indicate a test failure.
Don't we normally put these tests in a package?
Doesn't seem to be any hard and fast rule. I only use packages when
they are important for the test. In runtime we have 905 java files
and only 116 have a package statement. It varies elsewhere.
Thanks,
David
thanks,
Chris
On 7/5/18 2:58 AM, David Holmes wrote:
<sigh> Solaris compiler complains about doing a return from inside
a do-while loop. I'll have to rework part of the fix tomorrow.
David
On 5/07/2018 6:19 PM, David Holmes wrote:
Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
Problem:
The tests create native threads that attach to the VM through
JNI_AttachCurrentThread but which then terminate without
detaching themselves. When the VM exits and we're using Flight
Recorder "dumponexit" this leads to a call to VM_PrintThreads
that in part wants to print the per-thread CPU usage. When we
encounter the threads that have terminated already the low level
pthread_getcpuclockid call returns ESRCH but the code doesn't
expect that and so fails an assert in debug mode and can SEGV in
product mode.
Solution:
Serviceability-side: fix the tests
Change the tests so that the threads detach before terminating.
The two tests are (surprisingly) written in completely different
styles, so the solution also takes on two different styles.
Runtime-side: make the VM more robust in the face of JNI attached
threads that terminate before detaching, and add a regression test
I took a good look at the low-level code for interacting with
arbitrary threads and as far as I can see the problem only exists
for this one case of pthread_getcpuclockid on Linux. Elsewhere
the potential for a library call failure just reports an error
value (such as -1 for the cpu time used).
So the fix is simply to allow for ESRCH when calling
pthread_getcpuclockid and return -1 for the cpu usage in that case.
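As a rough sketch of that shape in plain C (not the actual VM change): look up the thread's CPU clock, treat ESRCH as "thread already gone" and report -1, and otherwise read the clock. The helper name thread_cpu_time_ns is hypothetical, and the assert on other error codes is an assumption about how strict such code would want to be.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: CPU time used by `thread` in nanoseconds,
 * or -1 if it cannot be determined. */
int64_t thread_cpu_time_ns(pthread_t thread) {
    clockid_t clockid;
    struct timespec ts;
    int rc = pthread_getcpuclockid(thread, &clockid);

    if (rc != 0) {
        /* ESRCH is what a terminated thread is expected to produce;
         * any other error is unexpected in this sketch. */
        assert(rc == ESRCH);
        return -1;
    }
    if (clock_gettime(clockid, &ts) != 0) {
        return -1;
    }
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

For a live thread, e.g. thread_cpu_time_ns(pthread_self()), this returns a non-negative value. Callers then only need to handle -1, as they already do for other failures.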
I created a new regression test to create a new native thread,
attach it and then let it terminate while still attached. The
java code then calls various Thread and ThreadMXBean functions on
it to ensure there are no crashes or unexpected exceptions.
Testing:
- old tests with fixed run-time
- old run-time with fixed tests
- mach5 tier4 (which exposed the problem - that's where we
enable Flight Recorder for the tests) [in progress]
- mach5 tier 1-3 for good measure [in progress]
- new regression test
Thanks,
David