Eryk Sun <eryk...@gmail.com> added the comment:
I assume you're linking to the CRT dynamically, so the CRT is shared with "python39.dll" and your code shares the configured locale with Python. Since you're not using an isolated configuration, Python sets the LC_CTYPE locale to the current user's default locale (configured in "HKCU\Control Panel\International").

If the STDOUT low I/O file is in ANSI text mode, the LC_CTYPE locale is not the default "C" locale, and the file is a console file, then C write() does a double-translated write. First, the UTF-8 byte string is decoded to wide-character UTF-16 using the current LC_CTYPE locale encoding. Then the wide-character string is encoded back to a byte string using the console output code page. The first step leads to mojibake if the locale encoding isn't UTF-8.

At a minimum, you'll need to add `cfg.configure_locale = 0` to prevent Python from configuring the LC_CTYPE locale to the default user locale.

That said, your code should be written to work in locales other than the default "C" locale. For the past few years, the Windows ucrt has supported UTF-8 as a locale encoding, such as via setlocale(LC_CTYPE, ".utf8").

Alternatively, or in addition to the latter, you can use std::wcout with wide-character strings and switch stdout to UTF-8 Unicode mode via _setmode(_fileno(stdout), _O_U8TEXT). In this case, the CRT writes to the console via _putwch(), which calls the wide-character WinAPI function WriteConsoleW(). If your code uses UTF-8 byte strings, you'll have to decode them to UTF-16 wide-character strings before writing to stdout.

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43091>
_______________________________________
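To illustrate the last suggestion (decode UTF-8 byte strings to UTF-16 before writing to a stdout that is in UTF-8 Unicode mode), here is a minimal sketch. The helper names utf8_to_utf16() and write_utf8_to_console() are hypothetical and not part of Python or the CRT. The decoder is hand-written so the example stays portable; on Windows you could instead use MultiByteToWideChar() with CP_UTF8. The console part is guarded by _WIN32 and assumes wchar_t is 16 bits wide, as it is on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cwchar>
#include <string>

#ifdef _WIN32
#include <fcntl.h>  // _O_U8TEXT
#include <io.h>     // _setmode
#endif

// Hypothetical helper: decode a UTF-8 byte string into UTF-16 code units.
// Each malformed, truncated, or overlong sequence yields U+FFFD.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs 1, 2, 3, or 4 bytes.
    static const std::uint32_t min_cp[4] = {0x0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        bool ok = true;
        if (b0 < 0x80)                { cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else                          { ok = false; }  // stray byte

        if (ok && i + extra >= in.size()) ok = false;  // truncated sequence
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, out-of-range values, and surrogates.
        if (ok && (cp < min_cp[extra] || cp > 0x10FFFF ||
                   (cp >= 0xD800 && cp <= 0xDFFF))) {
            ok = false;
        }
        if (!ok) {
            out.push_back(u'\uFFFD');
            ++i;  // resynchronize on the next byte
            continue;
        }

        assert(cp <= 0x10FFFF);
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return out;
}

#ifdef _WIN32
// Hypothetical helper, Windows only: switch stdout to UTF-8 Unicode mode and
// write the decoded text with a wide-character function.
void write_utf8_to_console(const std::string& utf8) {
    _setmode(_fileno(stdout), _O_U8TEXT);
    const std::u16string u16 = utf8_to_utf16(utf8);
    // Assumes 16-bit wchar_t, so the code units can be copied unchanged.
    const std::wstring wide(u16.begin(), u16.end());
    std::fputws(wide.c_str(), stdout);
}
#endif
```

Once stdout is in _O_U8TEXT mode, only wide-character output functions should be used on it, which is why the sketch writes with fputws() rather than a narrow call such as printf(). A call like write_utf8_to_console("caf\xC3\xA9\n") would then reach the console as wide characters instead of going through the locale-dependent double translation described above.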