std::codecvt has in and out methods. These are used to convert from one codeset to another. They return a status:
  std::codecvt_base::ok || std::codecvt_base::partial
    on success, or on success where the conversion was only partially done
  std::codecvt_base::noconv
    if no conversion is necessary
  std::codecvt_base::error
    on error (e.g. a character could not be converted because it is an
    invalid byte sequence)

When converting between char and wchar_t in UTF-8 and Latin-1 locales, this appears to behave correctly. However, when run in the C locale with UTF-8 characters in the input (which are invalid US-ASCII), it does not return error. It returns partial or ok, but the pointers to the next character are not updated. A caller that loops until all input has been consumed therefore never makes progress and loops forever.

A testcase is attached. Try running it in a UTF-8 locale, then run it in the C locale to compare (or comment out the first line of main). Next, remove the UTF-8 chars from the string "foo" in main, and repeat (this works correctly in both UTF-8 and C locales).

I think in this case codecvt is failing to correctly report an error when given invalid input.

Regards,
Roger

--
           Summary: codecvt causes infinite loop in C locale by not
                    returning an error status on failure
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rleigh at debian dot org
 GCC build triplet: powerpc-linux-gnu
  GCC host triplet: powerpc-linux-gnu
GCC target triplet: powerpc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28155
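
For illustration, the kind of conversion loop affected by this report can be sketched as below. This is not the attached testcase. The helper name widen, the 16-element buffer and the exception messages are assumptions made for the example. The sketch calls codecvt::in repeatedly until the input is consumed, and adds an explicit check for "no progress" so that a result of ok or partial with unchanged next pointers raises an exception instead of looping forever.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the attached testcase): convert a narrow
// string to a wide string using the codecvt facet of the given locale.
std::wstring widen(const std::string& in, const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
    const codecvt_type& cvt = std::use_facet<codecvt_type>(loc);

    std::mbstate_t state = std::mbstate_t();
    std::wstring out;
    const char* from = in.data();
    const char* const from_end = from + in.size();
    wchar_t buf[16];  // small on purpose, so "partial" results can occur

    while (from != from_end) {
        const char* from_next = from;
        wchar_t* to_next = buf;
        std::codecvt_base::result r =
            cvt.in(state, from, from_end, from_next, buf, buf + 16, to_next);

        if (r == std::codecvt_base::error)
            throw std::runtime_error("invalid input sequence");

        if (r == std::codecvt_base::noconv) {
            // No conversion needed: widen the remaining bytes directly.
            for (; from != from_end; ++from)
                out.push_back(static_cast<unsigned char>(*from));
            break;
        }

        // ok or partial: both next pointers must have moved. If neither
        // did, another call would repeat the same work indefinitely.
        if (from_next == from && to_next == buf)
            throw std::runtime_error("conversion made no progress");

        out.append(buf, static_cast<std::size_t>(to_next - buf));
        from = from_next;
    }
    return out;
}
```

Calling widen("foo", std::locale::classic()) exercises the plain US-ASCII case that the report describes as working in both locales. Without the no-progress check, the behaviour described above would leave from unchanged on every pass through the loop.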