On Tue, 25 May 2021 09:45:34 GMT, Maxim Kartashev 
<github.com+28651297+mkartas...@openjdk.org> wrote:

>> Character strings within the JVM are produced and consumed in several 
>> formats. Strings pass to and from Java in the UTF8 format, and POSIX APIs 
>> (like fprintf() or dlopen()) also consume strings in UTF8. On Windows, 
>> however, the situation is far less simple: some new(er) APIs expect UTF16 
>> (wide-character strings), while some older APIs can only work with strings 
>> in a "platform" format, where not all UTF8 characters can be represented; 
>> which ones can depends on the current "code page".
>> 
>> This commit switches the Windows version of native library loading code to 
>> using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use 
>> of various string formats in the surrounding code. 
>> 
>> Namely, exception messages are made to consume strings explicitly in the 
>> UTF8 format, while logging functions (that end up using legacy Windows API) 
>> are made to consume "platform" strings in most cases. One exception is 
>> `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, 
>> which can, of course, be fixed, but was considered not worth the additional 
>> code (NB: this isn't a new bug).
>> 
>> The test runs in a separate JVM in order to make NIO happy about non-ASCII 
>> characters in the file name; tests are executed with LC_ALL=C, and that 
>> doesn't let NIO work with non-ASCII file names even on Linux or macOS.
>> 
>> Tested by running `test/hotspot/jtreg:tier1` on Linux and 
>> `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test 
>> (`jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly 
>> run on those platforms as well.
>> 
>> Results from Linux:
>> 
>> Test summary
>> ==============================
>>    TEST                                              TOTAL  PASS  FAIL ERROR
>>    jtreg:test/hotspot/jtreg:tier1                     1784  1784     0     0
>> ==============================
>> TEST SUCCESS
>> 
>> 
>> Building target 'run-test-only' in configuration 
>> 'linux-x86_64-server-release'
>> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', 
>> will run:
>> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
>> 
>> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
>> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
>> Test results: passed: 1
>> 
>> 
>> Results from Windows 10:
>> 
>> Test summary
>> ==============================
>>    TEST                                              TOTAL  PASS  FAIL ERROR
>>    jtreg:test/hotspot/jtreg/runtime                    746   746     0     0
>> ==============================
>> TEST SUCCESS
>> Finished building target 'run-test-only' in configuration 
>> 'windows-x86_64-server-fastdebug'
>> 
>> 
>> Building target 'run-test-only' in configuration 
>> 'windows-x86_64-server-fastdebug'
>> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
>> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
>> 
>> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
>> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
>> Test results: passed: 1
>
> Maxim Kartashev has refreshed the contents of this pull request, and previous 
> commits have been removed. The incremental views will show differences 
> compared to the previous content of the PR. The pull request contains one new 
> commit since the last revision:
> 
>   8195129: System.load() fails to load from unicode paths

Hi Maxim,

Overall this seems okay. I've focused mainly on the hotspot parts, including 
the test.

A few minor changes requested. I do have some concerns about the impact on 
startup though and the efficiency of the conversion routines.

Thanks,
David

src/hotspot/os/windows/os_windows.cpp line 1462:

> 1460:   const int flag_source_str_is_null_terminated = -1;
> 1461:   const int flag_estimate_chars_count = 0;
> 1462:   int utf16_chars_count_estimated = MultiByteToWideChar(source_encoding,

Your local naming style is somewhat excessive. You could just comment the 
values of the flags when you pass them, e.g.:

MultiByteToWideChar(source_encoding,
                    MB_ERR_INVALID_CHARS,
                    source_str,
                    -1,    // source is null-terminated
                    NULL,  // no output buffer
                    0);    // calculate required buffer size

Or you could just add a comment before the call:

// Perform a dummy conversion so that we can get the required size of the
// buffer to allocate. The source is null-terminated.

Trying to document parameter semantics by variable naming doesn't work in my 
opinion - at some point if you want to know you have to RTFM for the API.

And utf16_len is perfectly adequate for the returned size.
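
For illustration only, here is a minimal portable sketch of the 
size-then-convert idiom discussed above. It does not call the Windows API: 
utf8_to_utf16() is a hypothetical stand-in that mimics the documented contract 
of MultiByteToWideChar (passing no output buffer returns the required size, 
including the terminator), and to_utf16() is likewise an invented helper name. 
Validation is deliberately minimal, so treat it as a sketch under those 
assumptions rather than the code in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical portable stand-in for the Windows conversion call (NOT the real
// API). With out == nullptr it only counts the UTF-16 code units needed,
// including the terminator; otherwise it writes them into out.
// Returns 0 on malformed input or when out_cap is too small.
// Validation is minimal: overlong forms and encoded surrogates are not rejected.
static int utf8_to_utf16(const char* src, char16_t* out, int out_cap) {
  const unsigned char* p = reinterpret_cast<const unsigned char*>(src);
  int n = 0;
  for (;;) {
    unsigned char lead = *p++;
    uint32_t cp;
    int extra;  // number of continuation bytes that follow the lead byte
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else return 0;
    for (int i = 0; i < extra; i++) {
      unsigned char c = *p++;
      if ((c & 0xC0) != 0x80) return 0;  // also catches a premature NUL
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp > 0x10FFFF) return 0;          // beyond the Unicode range
    int units = (cp >= 0x10000) ? 2 : 1;  // supplementary chars need a surrogate pair
    if (out != nullptr) {
      if (n + units > out_cap) return 0;
      if (units == 2) {
        uint32_t v = cp - 0x10000;
        out[n]     = static_cast<char16_t>(0xD800 + (v >> 10));
        out[n + 1] = static_cast<char16_t>(0xDC00 + (v & 0x3FF));
      } else {
        out[n] = static_cast<char16_t>(cp);
      }
    }
    n += units;
    if (lead == 0) return n;  // terminator has been counted (and written)
  }
}

// Two-pass idiom: query the size, allocate, then convert.
static bool to_utf16(const char* utf8, std::u16string* result) {
  assert(utf8 != nullptr && result != nullptr);
  int utf16_len = utf8_to_utf16(utf8, nullptr, 0);  // pass 1: size only
  if (utf16_len == 0) return false;
  result->assign(static_cast<std::size_t>(utf16_len), u'\0');
  int written = utf8_to_utf16(utf8, &(*result)[0], utf16_len);  // pass 2: convert
  if (written != utf16_len) return false;
  result->resize(static_cast<std::size_t>(utf16_len - 1));  // drop the terminator
  return true;
}
```

The source string is scanned twice, once per pass, which is the cost in 
question for every library load.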

src/hotspot/os/windows/os_windows.cpp line 1541:

> 1539: void * os::dll_load(const char *utf8_name, char *ebuf, int ebuflen) {
> 1540:   LPWSTR utf16_name = NULL;
> 1541:   errno_t err = convert_UTF8_to_UTF16(utf8_name, &utf16_name);

Do you have any figures on the cost of this additional conversion in relation 
to startup time?

I'm already concerned to see that we have to perform each conversion twice via 
MultiByteToWideChar/WideCharToMultiByte, once to get the size and then to 
actually get the characters! This seems potentially very inefficient.

test/hotspot/jtreg/runtime/jni/loadLibraryUnicode/LoadLibraryUnicode.java line 48:

> 46:         } else {
> 47:             throw new Error("Unsupported OS");
> 48:         }

Please use the test library function `Platform.sharedLibraryExt()` to get the 
library file extension.

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4169
