Hi all,

We are discussing source file encoding in PR #12436 [1].

I saw some C4819 warnings when I tried to build OpenJDK on Windows with the 
Japanese locale (CP932). C4819 means the source file contains characters 
which cl.exe cannot handle in the current code page (CP932 in my case).

I proposed suppressing C4819 in PRs #12436, #12437 [2], and #12435 [3]. I heard 
that JDK folks have discussed source file encoding several times, and it looks 
like we expect UTF-8.
So I would like to propose adding `-utf-8` to CFLAGS for Windows. What do you think?

The change is here: 
https://github.com/YaSuenag/jdk/commit/272678f8f0a74d893d98b507f2c0562bff900b9d


GCC expects UTF-8 as the source file encoding [4].
OTOH, on Windows cl.exe uses the current user code page when the source file 
does not have a BOM [5]. So I think we should think about Linux (on other 
platforms, e.g. macOS, I guess we can ignore this because we haven't seen any 
reports related to the locale, and the locale can be set directly there - WSL 
cannot do so).
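
To get a feel for how many files would be affected, here is a rough sketch 
that classifies source files by their raw bytes. It is not part of the JDK 
build; the names `classify`, `scan`, and `SOURCE_SUFFIXES` are hypothetical, 
and the suffix list is only an assumption about which files matter.

```python
import codecs
from pathlib import Path

# Assumed set of native source suffixes; adjust as needed.
SOURCE_SUFFIXES = {".c", ".cpp", ".h", ".hpp"}


def classify(data):
    """Return 'utf-8-bom', 'ascii', 'utf-8', or 'other' for raw file bytes."""
    if data.startswith(codecs.BOM_UTF8):
        # File starts with a UTF-8 BOM (the rest is not validated here).
        return "utf-8-bom"
    if data.isascii():
        # Pure ASCII reads the same in UTF-8 and in CP932.
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Non-ASCII bytes that are not valid UTF-8 (some legacy encoding).
        return "other"
    return "utf-8"


def scan(root):
    """Map each non-ASCII source file under root to its classification."""
    result = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            kind = classify(path.read_bytes())
            if kind != "ascii":
                result[str(path)] = kind
    return result
```

Running `scan("src")` from the top of a JDK checkout would list the 
candidates. Files reported as `utf-8` (valid UTF-8 without a BOM) are the 
ones I would expect cl.exe to read with the user code page unless `-utf-8` 
is passed.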

This proposal affects all native components in the JDK, so I want to discuss 
this topic before filing it in JBS and sending a PR.


I also think we should document the source file encoding somewhere. It could be 
"Operating System Requirements" in building.md. Let me know if there is a better place.


Thanks,

Yasumasa



[1] https://github.com/openjdk/jdk/pull/12436
[2] https://github.com/openjdk/jdk/pull/12437
[3] https://github.com/openjdk/jdk/pull/12435
[4] https://gcc.gnu.org/onlinedocs/gcc-12.2.0/cpp/Character-sets.html
[5] https://learn.microsoft.com/en-us/cpp/build/reference/utf-8-set-source-and-executable-character-sets-to-utf-8?view=msvc-170
