On 20/05/2021 2:29 am, Aleksei Voitylov wrote:
On Wed, 19 May 2021 16:21:41 GMT, Aleksei Voitylov <avoity...@openjdk.org> 
wrote:

Please review this PR which fixes the deadlock in ClassLoader between the two 
lock objects - a lock object associated with the class being loaded, and the 
ClassLoader.loadedLibraryNames hash map, locked during the native library load 
operation.

Problem being fixed:

The initial reproducer demonstrated a deadlock between the JarFile/ZipFile and 
the hash map. That deadlock exists even when the ZipFile/JarFile lock is 
removed because there's another lock object in the class loader, associated 
with the name of the class being loaded. Such objects are stored in 
ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads 
exactly the same class, whose signature is being verified in another thread.

Proposed fix:

The proposed patch suggests to get rid of locking loadedLibraryNames hash map 
and synchronize on each entry name, as it's done with class names in see 
ClassLoader.getClassLoadingLock(name) method.

The patch introduces nativeLibraryLockMap which holds the lock objects for each 
library name, and the getNativeLibraryLock() private method is used to lazily 
initialize the corresponding lock object. nativeLibraryContext was changed to 
ThreadLocal, so that in any concurrent thread it would have a NativeLibrary 
object on top of the stack, that's being currently loaded/unloaded in that 
thread. nativeLibraryLockMap accumulates the names of all native libraries 
loaded - in line with class loading code, it is not explicitly cleared.

Testing:  jtreg and jck testing with no regressions. A new regression test was 
developed.

Aleksei Voitylov has updated the pull request incrementally with one additional 
commit since the last revision:

   address review comments, add tests

Dear colleagues,

The updated PR addresses review comment regarding ThreadLocal as well as David' 
concern around the lock being held during JNI_OnLoad/JNI_OnUnload calls, and 
ensures all lock objects are deallocated. Multiple threads are allowed to enter 
NativeLibrary.load() to prevent any thread from locking while another thread 
loads a library. Before the update, there could be a class loading lock held by 
a parallel capable class loader, which can deadlock with the library loading 
lock. As proposed by David Holmes, the library loading lock was removed because 
dlopen/LoadLibrary are thread safe and they maintain internal reference 
counters on libraries. There's still a lock being held while a pair of 
containers are read/updated. It's not going to deadlock as there's no lock/wait 
operation performed while that lock is held. Multiple threads may create their 
own copies of NativeLibrary object and register it for auto unloading.

Tests for auto unloading were added along with the PR update. There are now 3 
jtreg tests:
- one checks for deadlock, similar to the one proposed by Chris Hegarty
- two other tests are for library unload.

The major side effect of that multiple threads are allowed to enter is that 
JNI_OnLoad/JNI_OnUnload may be called multiple (but same) number of times from 
concurrent threads. In particular, the number of calls to JNI_OnLoad must be 
equal to the number of calls to JNI_OnUnload after the relevant class loader is 
garbage collected. This may affect the behaviour that relies on specific order 
or the number of JNI_OnLoad/JNI_OnUnload calls. The current JNI specification 
does not mandate how many times JNI_OnLoad/JNI_OnUnload are called. Also, we 
could not locate tests in jck/jtreg/vmTestbase that would rely on the specific 
order or number of calls to JNI_OnLoad/JNI_OnUnload.

But you can't make such a change! That was my point. To fix the deadlock we must not hold a lock. But we must ensure only a single call to JNI_OnLoad is possible. It is an unsolvable problem with those constraints. You can't just change the behaviour of JNI_OnLoad like that.

David
-----

Thank you Alan Bateman, David Holmes and Chris Hegarty for your valuable input.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3976

Reply via email to