Re: RFR: 8266310: deadlock while loading the JNI code [v3]

2021-05-21 Thread Jorn Vernee
On Fri, 21 May 2021 15:39:33 GMT, Aleksei Voitylov  
wrote:

>> src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 
>> 262:
>> 
>>> 260: } finally {
>>> 261: releaseNativeLibraryLock(name);
>>> 262: }
>> 
>> The new locking scheme looks incorrect to me. It now seems possible for 2 
>> different class loaders in 2 different threads to load the same library 
>> (which should not be possible).
>> 
>> Thread 1:
>>   - acquires name lock
>>   - checks library name is already in `loadedLibraryNames` (it's not)
>>   - release name lock
>>   - start loading library
>>   
>> Then thread 2 comes in and does the same
>>   
>> Then again thread 1 finishes loading and:
>>  - acquires name lock
>>  - puts library name in `loadedLibraryNames`
>>  - puts library name in it's own `libraries`
>>  - release lock
>>  
>> Then thread 2 comes in and does the same again (although adding the name to 
>> `loadedLibraryNames` will silently return `false`).
>> 
>> It seems that this scenario is possible, and the library will be loaded by 2 
>> different class loaders, each with their own `lib` object. (If there's a 
>> race, there has to be a loser as well, which would need to discard their 
>> work when finding out they lost)
>> 
>> You might be able to stress this by checking if 
>> `loadedLibraryNames.add(name);` returns `true`, and throwing an exception if 
>> not.
>> 
>> I think a possible solution would be to claim the library name up front for 
>> a particular loader. Then 2 threads can still race to load the same library 
>> for the same class loader, but 2 threads with 2 different class loaders 
>> racing to load the library should not be possible.
>> 
>> Actually, you might be able to prevent a race on JNI_OnLoad altogether by 
>> claiming the library name for a particular thread upfront as well (e.g. 
>> using a `ConcurrentHashMap`).
>
> Thank you, the previous version of the PR suffered from other problems as 
> well. 
> 
> The PR is now reverted to a scheme with a lock held on library name, thus 
> such a situation is no longer possible.

Ok, thanks.

FWIW, I think trying to lock on a per-name basis instead of globally is a good 
idea, and should improve the current situation. It is also safe to do as far as 
I can see.

-

PR: https://git.openjdk.java.net/jdk/pull/3976


Re: RFR: 8266310: deadlock while loading the JNI code [v3]

2021-05-21 Thread Aleksei Voitylov
On Wed, 19 May 2021 18:47:52 GMT, Jorn Vernee  wrote:

>> Aleksei Voitylov has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   fix trailing whitespace
>
> src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 262:
> 
>> 260: } finally {
>> 261: releaseNativeLibraryLock(name);
>> 262: }
> 
> The new locking scheme looks incorrect to me. It now seems possible for 2 
> different class loaders in 2 different threads to load the same library 
> (which should not be possible).
> 
> Thread 1:
>   - acquires name lock
>   - checks library name is already in `loadedLibraryNames` (it's not)
>   - release name lock
>   - start loading library
>   
> Then thread 2 comes in and does the same
>   
> Then again thread 1 finishes loading and:
>  - acquires name lock
>  - puts library name in `loadedLibraryNames`
>  - puts library name in it's own `libraries`
>  - release lock
>  
> Then thread 2 comes in and does the same again (although adding the name to 
> `loadedLibraryNames` will silently return `false`).
> 
> It seems that this scenario is possible, and the library will be loaded by 2 
> different class loaders, each with their own `lib` object. (If there's a 
> race, there has to be a loser as well, which would need to discard their work 
> when finding out they lost)
> 
> You might be able to stress this by checking if 
> `loadedLibraryNames.add(name);` returns `true`, and throwing an exception if 
> not.
> 
> I think a possible solution would be to claim the library name up front for a 
> particular loader. Then 2 threads can still race to load the same library for 
> the same class loader, but 2 threads with 2 different class loaders racing to 
> load the library should not be possible.
> 
> Actually, you might be able to prevent a race on JNI_OnLoad altogether by 
> claiming the library name for a particular thread upfront as well (e.g. using 
> a `ConcurrentHashMap`).

Thank you, the previous version of the PR suffered from other problems as well. 

The PR is now reverted to a scheme with a lock held on library name, thus such 
a situation is no longer possible.

-

PR: https://git.openjdk.java.net/jdk/pull/3976


Re: RFR: 8266310: deadlock while loading the JNI code [v3]

2021-05-20 Thread Aleksei Voitylov
On Wed, 19 May 2021 16:29:33 GMT, Aleksei Voitylov  
wrote:

>> Please review this PR which fixes the deadlock in ClassLoader between the 
>> two lock objects - a lock object associated with the class being loaded, and 
>> the ClassLoader.loadedLibraryNames hash map, locked during the native 
>> library load operation.
>> 
>> Problem being fixed:
>> 
>> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile 
>> and the hash map. That deadlock exists even when the ZipFile/JarFile lock is 
>> removed because there's another lock object in the class loader, associated 
>> with the name of the class being loaded. Such objects are stored in 
>> ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads 
>> exactly the same class, whose signature is being verified in another thread.
>> 
>> Proposed fix:
>> 
>> The proposed patch suggests to get rid of locking loadedLibraryNames hash 
>> map and synchronize on each entry name, as it's done with class names in see 
>> ClassLoader.getClassLoadingLock(name) method.
>> 
>> The patch introduces nativeLibraryLockMap which holds the lock objects for 
>> each library name, and the getNativeLibraryLock() private method is used to 
>> lazily initialize the corresponding lock object. nativeLibraryContext was 
>> changed to ThreadLocal, so that in any concurrent thread it would have a 
>> NativeLibrary object on top of the stack, that's being currently 
>> loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names 
>> of all native libraries loaded - in line with class loading code, it is not 
>> explicitly cleared.
>> 
>> Testing:  jtreg and jck testing with no regressions. A new regression test 
>> was developed.
>
> Aleksei Voitylov has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   fix trailing whitespace

Sorry, let me get back to this after I talk with the chalkboard.

-

PR: https://git.openjdk.java.net/jdk/pull/3976


Re: RFR: 8266310: deadlock while loading the JNI code [v3]

2021-05-19 Thread Jorn Vernee
On Wed, 19 May 2021 16:29:33 GMT, Aleksei Voitylov  
wrote:

>> Please review this PR which fixes the deadlock in ClassLoader between the 
>> two lock objects - a lock object associated with the class being loaded, and 
>> the ClassLoader.loadedLibraryNames hash map, locked during the native 
>> library load operation.
>> 
>> Problem being fixed:
>> 
>> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile 
>> and the hash map. That deadlock exists even when the ZipFile/JarFile lock is 
>> removed because there's another lock object in the class loader, associated 
>> with the name of the class being loaded. Such objects are stored in 
>> ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads 
>> exactly the same class, whose signature is being verified in another thread.
>> 
>> Proposed fix:
>> 
>> The proposed patch suggests to get rid of locking loadedLibraryNames hash 
>> map and synchronize on each entry name, as it's done with class names in see 
>> ClassLoader.getClassLoadingLock(name) method.
>> 
>> The patch introduces nativeLibraryLockMap which holds the lock objects for 
>> each library name, and the getNativeLibraryLock() private method is used to 
>> lazily initialize the corresponding lock object. nativeLibraryContext was 
>> changed to ThreadLocal, so that in any concurrent thread it would have a 
>> NativeLibrary object on top of the stack, that's being currently 
>> loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names 
>> of all native libraries loaded - in line with class loading code, it is not 
>> explicitly cleared.
>> 
>> Testing:  jtreg and jck testing with no regressions. A new regression test 
>> was developed.
>
> Aleksei Voitylov has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   fix trailing whitespace

src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 262:

> 260: } finally {
> 261: releaseNativeLibraryLock(name);
> 262: }

The new locking scheme looks incorrect to me. It now seems possible for 2 
different class loaders in 2 different threads to load the same library (which 
should not be possible).

Thread 1:
  - acquires name lock
  - checks library name is already in `loadedLibraryNames` (it's not)
  - release name lock
  - start loading library
  
Then thread 2 comes in and does the same
  
Then again thread 1 finishes loading and:
 - acquires name lock
 - puts library name in `loadedLibraryNames`
 - puts library name in it's own `libraries`
 - release lock
 
Then thread 2 comes in and does the same again (although adding the name to 
`loadedLibraryNames` will silently return `false`).

It seems that this scenario is possible, and the library will be loaded by 2 
different class loaders, each with their own `lib` object. (If there's a race, 
there has to be a loser as well, which would need to discard their work when 
finding out they lost)

You might be able to stress this by checking if `loadedLibraryNames.add(name);` 
returns `true`, and throwing an exception if not.

I think a possible solution would be to claim the library name up front for a 
particular loader. Then 2 threads can still race to load the same library for 
the same class loader, but 2 threads with 2 different class loaders racing to 
load the library should not be possible.

Actually, you might be able to prevent a race on JNI_OnLoad altogether by 
claiming the library name for a particular thread upfront as well (e.g. using a 
`ConcurrentHashMap`).

-

PR: https://git.openjdk.java.net/jdk/pull/3976


Re: RFR: 8266310: deadlock while loading the JNI code [v3]

2021-05-19 Thread Aleksei Voitylov
> Please review this PR which fixes the deadlock in ClassLoader between the two 
> lock objects - a lock object associated with the class being loaded, and the 
> ClassLoader.loadedLibraryNames hash map, locked during the native library 
> load operation.
> 
> Problem being fixed:
> 
> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile 
> and the hash map. That deadlock exists even when the ZipFile/JarFile lock is 
> removed because there's another lock object in the class loader, associated 
> with the name of the class being loaded. Such objects are stored in 
> ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads 
> exactly the same class, whose signature is being verified in another thread.
> 
> Proposed fix:
> 
> The proposed patch suggests to get rid of locking loadedLibraryNames hash map 
> and synchronize on each entry name, as it's done with class names in see 
> ClassLoader.getClassLoadingLock(name) method.
> 
> The patch introduces nativeLibraryLockMap which holds the lock objects for 
> each library name, and the getNativeLibraryLock() private method is used to 
> lazily initialize the corresponding lock object. nativeLibraryContext was 
> changed to ThreadLocal, so that in any concurrent thread it would have a 
> NativeLibrary object on top of the stack, that's being currently 
> loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names of 
> all native libraries loaded - in line with class loading code, it is not 
> explicitly cleared.
> 
> Testing:  jtreg and jck testing with no regressions. A new regression test 
> was developed.

Aleksei Voitylov has updated the pull request incrementally with one additional 
commit since the last revision:

  fix trailing whitespace

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3976/files
  - new: https://git.openjdk.java.net/jdk/pull/3976/files/a0b2163a..8f47d890

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3976&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3976&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3976.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3976/head:pull/3976

PR: https://git.openjdk.java.net/jdk/pull/3976