Thanks.

I think strcpy(ret+2, "1252") vs. strcpy(ret, "Cp1252") is just a matter of style. I would prefer the latter, as it makes the intent clear.
But not my call.
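
To illustrate why both spellings end up with the same result, here is a rough, portable sketch of the failure path. It is not the actual JDK code: fake_get_locale_info is a hypothetical stand-in for a GetLocaleInfo call that fails, the function names are made up, and the switch is reduced to the default case that writes the "Cp" prefix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a failing GetLocaleInfo call:
 * returns 0 and leaves the output buffer untouched. */
static int fake_get_locale_info(char *out, int len) {
    (void)out;
    (void)len;
    return 0;
}

/* Variant A: on failure write only the digits; the switch adds "Cp". */
static void encoding_variant_a(char *ret) {
    int codepage;
    assert(ret != NULL);
    if (fake_get_locale_info(ret + 2, 14) == 0) {
        codepage = 1252;
        strcpy(ret + 2, "1252");
    } else {
        codepage = atoi(ret + 2);
    }
    switch (codepage) {
    default:            /* reduced switch: only the "Cp" prefix case */
        ret[0] = 'C';
        ret[1] = 'p';
        break;
    }
}

/* Variant B: on failure write the complete name in one call. */
static void encoding_variant_b(char *ret) {
    int codepage;
    assert(ret != NULL);
    if (fake_get_locale_info(ret + 2, 14) == 0) {
        codepage = 1252;
        strcpy(ret, "Cp1252");
    } else {
        codepage = atoi(ret + 2);
    }
    switch (codepage) {
    default:            /* overwrites "Cp" with "Cp", which is harmless */
        ret[0] = 'C';
        ret[1] = 'p';
        break;
    }
}
```

Without either strcpy, the bytes after the "Cp" prefix would stay uninitialized on the failure path, which is the suspected bug.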

Do you have any info on how I can change the detected code page there? I wrote a small C program that basically does just that part and prints the result. In my limited tests (Windows likes to require a restart after each configuration change) I did not find a way to influence it.

Another thing to consider is whether Cp65001 should be treated as UTF-8 in that function. (As said before, locale is not my expertise. Can that function with that LCSID even return 65001?) I can see how things go wrong if it returns 65001 as locale, so... could be a safe change? (I'm sure that things break if that function returns 65001.)
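
To make the question concrete, this is roughly what such a special case could look like. It is only a sketch under assumptions: encoding_for_codepage is a hypothetical helper, not the real function, and it models just the proposed 65001 case plus a generic "Cp<number>" default.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a Windows code page number to an
 * encoding name. Only two cases are modelled here. */
static void encoding_for_codepage(int codepage, char *ret, size_t size) {
    assert(ret != NULL && size >= 16);
    switch (codepage) {
    case 65001:         /* Windows code page identifier for UTF-8 */
        strcpy(ret, "UTF-8");
        break;
    default:            /* everything else becomes "Cp<number>" */
        snprintf(ret, size, "Cp%d", codepage);
        break;
    }
}
```

With this, 65001 would yield "UTF-8" instead of "Cp65001", while 1252 would still yield "Cp1252".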

Then there is the other part:
The comment in jni_util.c/newSizedStringJava does not match the implementation on the Java side: there is no fallback to ISO-8859-1. If new String(byte[]) is called before the system properties are installed, this will lead to a NullPointerException. And there is a code path that leads to exactly that: newPlatformString is called from the initialization of the properties. [1] If the encoding returned by the Windows function is not supported, it will call new String(byte[]) during system property initialization.

- Johannes

[1]: https://hg.openjdk.java.net/jdk/jdk/file/d40d865753fb/src/java.base/share/native/libjava/System.c#l207

On 08-May-20 18:27, naoto.s...@oracle.com wrote:
Ditto. Good catch!

I am not sure the fix would address the issue in 8226810 (cannot confirm it either, as my Windows box is at my office where I cannot enter at the moment :-), but this definitely looks like a bug. I would change the additional line to "strcpy(ret+2, "1252");" as Cp is copied in the following switch.

Naoto



On 5/7/20 5:50 AM, Alan Bateman wrote:
On 07/05/2020 12:37, Johannes Kuhn wrote:
:

In the end, I don't know what causes the bug, or how I can replicate it.
I think I did find a likely suspect.
Good sleuthing. I don't know what the conditions are for GetLocaleInfo to fail but it does look like that would return possibly non-terminated garbage starting with "CP" so we should at least fix that.

The issue in JDK-8226810 might be something else. One of the submitters to that issue did engage and provided enough information to learn that the locale is zh_CN and also reported that it was failing for GB18030. GB18030 is not in java.base so that at least explained that report.

-Alan

