Re: RFR: 8266310: deadlock while loading the JNI code [v3]
On Fri, 21 May 2021 15:39:33 GMT, Aleksei Voitylov wrote: >> src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line >> 262: >> >>> 260: } finally { >>> 261: releaseNativeLibraryLock(name); >>> 262: } >> >> The new locking scheme looks incorrect to me. It now seems possible for 2 >> different class loaders in 2 different threads to load the same library >> (which should not be possible). >> >> Thread 1: >> - acquires name lock >> - checks library name is already in `loadedLibraryNames` (it's not) >> - release name lock >> - start loading library >> >> Then thread 2 comes in and does the same >> >> Then again thread 1 finishes loading and: >> - acquires name lock >> - puts library name in `loadedLibraryNames` >> - puts library name in it's own `libraries` >> - release lock >> >> Then thread 2 comes in and does the same again (although adding the name to >> `loadedLibraryNames` will silently return `false`). >> >> It seems that this scenario is possible, and the library will be loaded by 2 >> different class loaders, each with their own `lib` object. (If there's a >> race, there has to be a loser as well, which would need to discard their >> work when finding out they lost) >> >> You might be able to stress this by checking if >> `loadedLibraryNames.add(name);` returns `true`, and throwing an exception if >> not. >> >> I think a possible solution would be to claim the library name up front for >> a particular loader. Then 2 threads can still race to load the same library >> for the same class loader, but 2 threads with 2 different class loaders >> racing to load the library should not be possible. >> >> Actually, you might be able to prevent a race on JNI_OnLoad altogether by >> claiming the library name for a particular thread upfront as well (e.g. >> using a `ConcurrentHashMap`). > > Thank you, the previous version of the PR suffered from other problems as > well. > > The PR is now reverted to a scheme with a lock held on library name, thus > such a situation is no longer possible. Ok, thanks. FWIW, I think trying to lock on a per-name basis instead of globally is a good idea, and should improve the current situation. It is also safe to do as far as I can see. - PR: https://git.openjdk.java.net/jdk/pull/3976
Re: RFR: 8266310: deadlock while loading the JNI code [v3]
On Wed, 19 May 2021 18:47:52 GMT, Jorn Vernee wrote: >> Aleksei Voitylov has updated the pull request incrementally with one >> additional commit since the last revision: >> >> fix trailing whitespace > > src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 262: > >> 260: } finally { >> 261: releaseNativeLibraryLock(name); >> 262: } > > The new locking scheme looks incorrect to me. It now seems possible for 2 > different class loaders in 2 different threads to load the same library > (which should not be possible). > > Thread 1: > - acquires name lock > - checks library name is already in `loadedLibraryNames` (it's not) > - release name lock > - start loading library > > Then thread 2 comes in and does the same > > Then again thread 1 finishes loading and: > - acquires name lock > - puts library name in `loadedLibraryNames` > - puts library name in it's own `libraries` > - release lock > > Then thread 2 comes in and does the same again (although adding the name to > `loadedLibraryNames` will silently return `false`). > > It seems that this scenario is possible, and the library will be loaded by 2 > different class loaders, each with their own `lib` object. (If there's a > race, there has to be a loser as well, which would need to discard their work > when finding out they lost) > > You might be able to stress this by checking if > `loadedLibraryNames.add(name);` returns `true`, and throwing an exception if > not. > > I think a possible solution would be to claim the library name up front for a > particular loader. Then 2 threads can still race to load the same library for > the same class loader, but 2 threads with 2 different class loaders racing to > load the library should not be possible. > > Actually, you might be able to prevent a race on JNI_OnLoad altogether by > claiming the library name for a particular thread upfront as well (e.g. using > a `ConcurrentHashMap`). Thank you, the previous version of the PR suffered from other problems as well. The PR is now reverted to a scheme with a lock held on library name, thus such a situation is no longer possible. - PR: https://git.openjdk.java.net/jdk/pull/3976
Re: RFR: 8266310: deadlock while loading the JNI code [v3]
On Wed, 19 May 2021 16:29:33 GMT, Aleksei Voitylov wrote: >> Please review this PR which fixes the deadlock in ClassLoader between the >> two lock objects - a lock object associated with the class being loaded, and >> the ClassLoader.loadedLibraryNames hash map, locked during the native >> library load operation. >> >> Problem being fixed: >> >> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile >> and the hash map. That deadlock exists even when the ZipFile/JarFile lock is >> removed because there's another lock object in the class loader, associated >> with the name of the class being loaded. Such objects are stored in >> ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads >> exactly the same class, whose signature is being verified in another thread. >> >> Proposed fix: >> >> The proposed patch suggests to get rid of locking loadedLibraryNames hash >> map and synchronize on each entry name, as it's done with class names in see >> ClassLoader.getClassLoadingLock(name) method. >> >> The patch introduces nativeLibraryLockMap which holds the lock objects for >> each library name, and the getNativeLibraryLock() private method is used to >> lazily initialize the corresponding lock object. nativeLibraryContext was >> changed to ThreadLocal, so that in any concurrent thread it would have a >> NativeLibrary object on top of the stack, that's being currently >> loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names >> of all native libraries loaded - in line with class loading code, it is not >> explicitly cleared. >> >> Testing: jtreg and jck testing with no regressions. A new regression test >> was developed. > > Aleksei Voitylov has updated the pull request incrementally with one > additional commit since the last revision: > > fix trailing whitespace Sorry, let me get back to this after I talk with the chalkboard. - PR: https://git.openjdk.java.net/jdk/pull/3976
Re: RFR: 8266310: deadlock while loading the JNI code [v3]
On Wed, 19 May 2021 16:29:33 GMT, Aleksei Voitylov wrote: >> Please review this PR which fixes the deadlock in ClassLoader between the >> two lock objects - a lock object associated with the class being loaded, and >> the ClassLoader.loadedLibraryNames hash map, locked during the native >> library load operation. >> >> Problem being fixed: >> >> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile >> and the hash map. That deadlock exists even when the ZipFile/JarFile lock is >> removed because there's another lock object in the class loader, associated >> with the name of the class being loaded. Such objects are stored in >> ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads >> exactly the same class, whose signature is being verified in another thread. >> >> Proposed fix: >> >> The proposed patch suggests to get rid of locking loadedLibraryNames hash >> map and synchronize on each entry name, as it's done with class names in see >> ClassLoader.getClassLoadingLock(name) method. >> >> The patch introduces nativeLibraryLockMap which holds the lock objects for >> each library name, and the getNativeLibraryLock() private method is used to >> lazily initialize the corresponding lock object. nativeLibraryContext was >> changed to ThreadLocal, so that in any concurrent thread it would have a >> NativeLibrary object on top of the stack, that's being currently >> loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names >> of all native libraries loaded - in line with class loading code, it is not >> explicitly cleared. >> >> Testing: jtreg and jck testing with no regressions. A new regression test >> was developed. > > Aleksei Voitylov has updated the pull request incrementally with one > additional commit since the last revision: > > fix trailing whitespace src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 262: > 260: } finally { > 261: releaseNativeLibraryLock(name); > 262: } The new locking scheme looks incorrect to me. It now seems possible for 2 different class loaders in 2 different threads to load the same library (which should not be possible). Thread 1: - acquires name lock - checks library name is already in `loadedLibraryNames` (it's not) - release name lock - start loading library Then thread 2 comes in and does the same Then again thread 1 finishes loading and: - acquires name lock - puts library name in `loadedLibraryNames` - puts library name in it's own `libraries` - release lock Then thread 2 comes in and does the same again (although adding the name to `loadedLibraryNames` will silently return `false`). It seems that this scenario is possible, and the library will be loaded by 2 different class loaders, each with their own `lib` object. (If there's a race, there has to be a loser as well, which would need to discard their work when finding out they lost) You might be able to stress this by checking if `loadedLibraryNames.add(name);` returns `true`, and throwing an exception if not. I think a possible solution would be to claim the library name up front for a particular loader. Then 2 threads can still race to load the same library for the same class loader, but 2 threads with 2 different class loaders racing to load the library should not be possible. Actually, you might be able to prevent a race on JNI_OnLoad altogether by claiming the library name for a particular thread upfront as well (e.g. using a `ConcurrentHashMap`). - PR: https://git.openjdk.java.net/jdk/pull/3976
Re: RFR: 8266310: deadlock while loading the JNI code [v3]
> Please review this PR which fixes the deadlock in ClassLoader between the two > lock objects - a lock object associated with the class being loaded, and the > ClassLoader.loadedLibraryNames hash map, locked during the native library > load operation. > > Problem being fixed: > > The initial reproducer demonstrated a deadlock between the JarFile/ZipFile > and the hash map. That deadlock exists even when the ZipFile/JarFile lock is > removed because there's another lock object in the class loader, associated > with the name of the class being loaded. Such objects are stored in > ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads > exactly the same class, whose signature is being verified in another thread. > > Proposed fix: > > The proposed patch suggests to get rid of locking loadedLibraryNames hash map > and synchronize on each entry name, as it's done with class names in see > ClassLoader.getClassLoadingLock(name) method. > > The patch introduces nativeLibraryLockMap which holds the lock objects for > each library name, and the getNativeLibraryLock() private method is used to > lazily initialize the corresponding lock object. nativeLibraryContext was > changed to ThreadLocal, so that in any concurrent thread it would have a > NativeLibrary object on top of the stack, that's being currently > loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names of > all native libraries loaded - in line with class loading code, it is not > explicitly cleared. > > Testing: jtreg and jck testing with no regressions. A new regression test > was developed. Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: fix trailing whitespace - Changes: - all: https://git.openjdk.java.net/jdk/pull/3976/files - new: https://git.openjdk.java.net/jdk/pull/3976/files/a0b2163a..8f47d890 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3976&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3976&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3976.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3976/head:pull/3976 PR: https://git.openjdk.java.net/jdk/pull/3976