Hi Patricio,

On 18/03/2020 6:14 am, Patricio Chilano wrote:
Hi all,

Please review the following patch:

Bug: https://bugs.openjdk.java.net/browse/JDK-8240902
Webrev: http://cr.openjdk.java.net/~pchilanomate/8240902/v1/webrev/

Calling closeConnection() on an already created/opened connection includes calls to CloseHandle() on objects that can still be used by other threads. This can lead to either undefined behavior or, as detailed in the bug comments, changes of state of unrelated objects.

This was a really great find!

This issue was found while debugging the reason behind some jshell test failures seen after pushing 8230594. Not as important, but there are also calls to closeStream() from createStream()/openStream() when failing to create/open a stream that will return after executing "CHECK_ERROR(enterMutex(stream, NULL));" without closing the intended resources. Then, calling closeConnection() could assert if the reason of the previous failure was that the stream's mutex failed to be created/opened. These patch aims to address these issues too.

Patch looks good in general. The internal reference count guards deletion of the internal resources, and is itself safe because never actually delete the connection. Thanks for adding the comment about this aspect.

A few items:

Please update copyright year before pushing.

Please align ENTER_CONNECTION/LEAVE_CONNECTION macros the same way as STREAM_INVARIANT.

 170     unsigned int refcount;
 171     jint state;

I'm unclear about the use of stream->state and connection->state as guards - unless accessed under a mutex these would seem to at least need acquire/release semantics.

Additionally the reads of refcount would also seem to need to some form of memory synchronization - though the Windows docs for the Interlocked* API does not show how to simply read such a variable! Though I note that the RtlFirstEntrySList method for the "Interlocked Singly Linked Lists" API does state "Access to the list is synchronized on a multiprocessor system." which suggests a read of such a variable does require some form of memory synchronization!

 413     while (attempts>0) {

spaces around >

If the loop at 413 never encounters a zero reference_count then it doesn't close the events or the mutex but still returns SYS_OK. That seems wrong but I'm not sure what the right behaviour is here.

And please wait for serviceability folk to review this.

Thanks,
David
-----

Tested in mach5 with the current baseline, tiers1-3 and several runs of open/test/langtools/:tier1 which includes the jshell tests where this connector is used. I also applied patch http://cr.openjdk.java.net/~pchilanomate/8240902/triggerbug/webrev mentioned in the comments of the bug, on top of the baseline and run the langtool tests with and without this fix. Without the fix running around 30 repetitions already shows failures in tests jdk/jshell/FailOverExecutionControlTest.java and jdk/jshell/FailOverExecutionControlHangingLaunchTest.java. With the fix I run several hundred runs and saw no failures. Let me know if there is any additional testing I should do.

As a side note, I see there are a couple of open issues related with jshell failures (8209848) which could be related to this bug and therefore might be fixed by this patch.

Thanks,
Patricio

Reply via email to