https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71107

            Bug ID: 71107
           Summary: wstring_convert::from_bytes produces wide chars with the
                    wrong byte order
           Product: gcc
           Version: 6.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cantabile.desu at gmail dot com
  Target Milestone: ---

This small program illustrates the problem:

#include <locale>
#include <codecvt>
#include <cstdint>   // uint8_t
#include <cstdio>
#include <cwchar>    // wcslen
#include <string>

int wmain(int argc, wchar_t **argv) {
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> utf16;

    printf("Input bytes:\n");
    for (size_t i = 0; i < wcslen(argv[0]) * sizeof(wchar_t); i++)
        printf("%x ", (int)((uint8_t *)argv[0])[i]);
    printf("\n");

    std::string bytes = utf16.to_bytes(argv[0]);
    printf("Text after to_bytes: '%s'\n", bytes.c_str());
    printf("Bytes after to_bytes:\n");
    for (size_t i = 0; i < bytes.size(); i++)
        printf("%x ", (int)((const uint8_t *)bytes.c_str())[i]);
    printf("\n");

    std::wstring wide = utf16.from_bytes(bytes);
    printf("Bytes after from_bytes:\n");
    for (size_t i = 0; i < wide.size() * sizeof(wchar_t); i++)
        printf("%x ", (int)((const uint8_t *)wide.c_str())[i]);
    printf("\n");

    bytes = utf16.to_bytes(wide);
    printf("Text after to_bytes: '%s'\n", bytes.c_str());
    printf("Bytes after to_bytes:\n");
    for (size_t i = 0; i < bytes.size(); i++)
        printf("%x ", (int)((const uint8_t *)bytes.c_str())[i]);
    printf("\n");

    return 0;
}

Command:
i686-w64-mingw32-g++ -std=c++11 -municode -o test.exe test.cpp -static-libgcc -static-libstdc++

Output when compiled by GCC 6.1.1:

Input bytes:
5a 0 3a 0 5c 0 74 0 6d 0 70 0 5c 0 74 0 65 0 73 0 74 0 2e 0 65 0 78 0 65 0
Text after to_bytes: 'Z:\tmp\test.exe'
Bytes after to_bytes:
5a 3a 5c 74 6d 70 5c 74 65 73 74 2e 65 78 65
Bytes after from_bytes:
0 5a 0 3a 0 5c 0 74 0 6d 0 70 0 5c 0 74 0 65 0 73 0 74 0 2e 0 65 0 78 0 65
Text after to_bytes: '娀㨀尀琀洀瀀尀琀攀猀琀⸀攀砀攀'
Bytes after to_bytes:
e5 a8 80 e3 a8 80 e5 b0 80 e7 90 80 e6 b4 80 e7 80 80 e5 b0 80 e7 90 80 e6 94 80 e7 8c 80 e7 90 80 e2 b8 80 e6 94 80 e7 a0 80 e6 94 80

Output when compiled by GCC 5.1.0:

Input bytes:
5a 0 3a 0 5c 0 74 0 6d 0 70 0 5c 0 74 0 65 0 73 0 74 0 2e 0 65 0 78 0 65 0
Text after to_bytes: 'Z:\tmp\test.exe'
Bytes after to_bytes:
5a 3a 5c 74 6d 70 5c 74 65 73 74 2e 65 78 65
Bytes after from_bytes:
5a 0 3a 0 5c 0 74 0 6d 0 70 0 5c 0 74 0 65 0 73 0 74 0 2e 0 65 0 78 0 65 0
Text after to_bytes: 'Z:\tmp\test.exe'
Bytes after to_bytes:
5a 3a 5c 74 6d 70 5c 74 65 73 74 2e 65 78 65

GCC 5.3.0 is affected too.

Output of `i686-w64-mingw32-g++ -v`:

Using built-in specs.
COLLECT_GCC=i686-w64-mingw32-g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-w64-mingw32/6.1.1/lto-wrapper
Target: i686-w64-mingw32
Configured with: /build/mingw-w64-gcc/src/gcc-6-20160505/configure --prefix=/usr --libexecdir=/usr/lib --target=i686-w64-mingw32 --enable-languages=c,lto,c++,objc,obj-c++,fortran,ada --enable-shared --enable-static --enable-threads=posix --enable-fully-dynamic-string --enable-libstdcxx-time=yes --with-system-zlib --enable-cloog-backend=isl --enable-lto --disable-dw2-exceptions --enable-libgomp --disable-multilib --enable-checking=release
Thread model: posix
gcc version 6.1.1 20160505 (GCC)

The system is 64-bit Arch Linux. This GCC was obtained from the "mingw-w64-gcc" package from Arch Linux.