https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70893
Bug ID: 70893
Summary: codecvt incorrectly decodes UTF-16 due to optimization
Product: gcc
Version: 5.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: kirillnow at gmail dot com
Target Milestone: ---

In the libstdc++ source file codecvt.cc:

  inline bool is_high_surrogate(char32_t c)
  { return c >= 0xD800 && c <= 0xDBFF; }

compiles to:

  if (is_high_surrogate(c))
     0x7ffff7b4d275  lea    ecx,[rsi-0xd800]
     0x7ffff7b4d27b  cmp    ecx,0x3ff
     0x7ffff7b4d281  ja     0x7ffff7b4d2ad
  {

This code incorrectly decodes code points such as 0xDE00 (iconv can
produce those). GCC and the library were compiled with -Os.

Possible solution:

  inline bool is_high_surrogate(char32_t c)
  { return (c & 0xFC00) == 0xD800; }
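For context (not part of the original report): the quoted disassembly is
GCC's usual folding of a two-sided range check into a single unsigned
comparison, i.e. (uint32_t)(c - 0xD800) <= 0x3FF. Below is a minimal
standalone sketch, with illustrative helper names that are not taken
from libstdc++, which evaluates the original predicate, the proposed
masked predicate, and the folded form side by side for a few sample
code points, including the 0xDE00 value mentioned above:

  // Hedged sketch, not from the bug report; helper names are for
  // illustration only.
  #include <cstdint>
  #include <cstdio>

  // Original predicate as quoted from codecvt.cc.
  static bool is_high_surrogate_range(char32_t c)
  { return c >= 0xD800 && c <= 0xDBFF; }

  // Replacement proposed in this report.
  static bool is_high_surrogate_masked(char32_t c)
  { return (c & 0xFC00) == 0xD800; }

  // What the quoted -Os disassembly computes:
  //   lea ecx,[rsi-0xd800] ; cmp ecx,0x3ff ; ja ...
  // i.e. an unsigned 32-bit range check.
  static bool is_high_surrogate_folded(char32_t c)
  { return static_cast<std::uint32_t>(c - 0xD800) <= 0x3FF; }

  int main()
  {
      const char32_t samples[] =
          { 0xD7FF, 0xD800, 0xDBFF, 0xDC00, 0xDE00, 0xDFFF };
      for (char32_t c : samples)
          std::printf("U+%04X range=%d masked=%d folded=%d\n",
                      static_cast<unsigned>(c),
                      is_high_surrogate_range(c),
                      is_high_surrogate_masked(c),
                      is_high_surrogate_folded(c));
  }

Building this with the same -Os flags and comparing the printed values
(or the generated assembly) makes it easy to check whether the three
forms diverge for any of the sample code points.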