On Tue, 5 Dec 2023 10:35:05 GMT, Magnus Ihse Bursie <i...@openjdk.org> wrote:
> We're currently setting LC_ALL=C. Not all tools will default to utf-8 as
> their encoding of choice when they see this locale, but use an arbitrary
> encoding, which might not properly handle all UTF-8 characters. Since in
> practice, all our encoding is utf8, we should tell our tools this as well.
>
> This will at least have effect on how Java treats path names including
> unicode characters.

Of course this was not that easy. One does not simply add "utf8". I got a diff in ./lib/classlist:

401d400
< java/nio/charset/StandardCharsets
1182d1180
< sun/nio/cs/ISO_8859_1
1184,1185d1181
< sun/nio/cs/StandardCharsets$Aliases
< sun/nio/cs/StandardCharsets$Cache
1187,1196d1182
< sun/nio/cs/Surrogate
< sun/nio/cs/Surrogate$Parser
< sun/nio/cs/US_ASCII
< sun/nio/cs/US_ASCII$Encoder
< sun/nio/cs/UTF_16
< sun/nio/cs/UTF_16BE
< sun/nio/cs/UTF_16LE
< sun/nio/cs/UTF_32
< sun/nio/cs/UTF_32BE
< sun/nio/cs/UTF_32LE
1197a1184
> sun/nio/cs/UTF_8$Encoder
1232d1218
< sun/util/PreHashedMap

The PreHashedMap thing looks weird; the others seem definitely character set related. I'll have to investigate this.

Oh, shut up Wesley!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16971#issuecomment-1842838066
PR Comment: https://git.openjdk.org/jdk/pull/16971#issuecomment-1920847508
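
For anyone who wants to see what the locale change does to a running JVM, here is a minimal sketch (not part of the PR; the class name EncodingProbe is made up for illustration). It prints the encoding-related settings the JVM picked up from its environment. Note that sun.jnu.encoding is an internal, unsupported property, and that I am assuming it is the one that governs path name encoding; native.encoding should only be present on JDK 17 and later.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not from the JDK build: reports the encoding-related
// settings that the JVM derived from the locale it was started under.
public class EncodingProbe {

    static Map<String, String> probe() {
        Map<String, String> m = new LinkedHashMap<>();
        // Locale as seen by the process; "null" if the variable is unset.
        m.put("LC_ALL", String.valueOf(System.getenv("LC_ALL")));
        // Internal property (assumption: used for path name encoding).
        m.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        // Encoding derived from the host locale; absent before JDK 17.
        m.put("native.encoding", String.valueOf(System.getProperty("native.encoding")));
        m.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        m.put("defaultCharset", Charset.defaultCharset().name());
        return m;
    }

    public static void main(String[] args) {
        probe().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Running it once under LC_ALL=C and once under a UTF-8 locale (for example C.UTF-8, if the system provides it) and comparing the output should show which settings actually react to the locale.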