I've stumbled across a problem with the C++ JSON encoding in a service running on Windows. For example, a multibyte UTF-8 code point such as "\xEF\xBD\x81" is incorrectly encoded as "\xEF\xBD\u0081" when the service runs in the windows-1252 locale. This isn't a valid UTF-8 sequence, so we end up with mojibake when we try to read back the JSON-encoded string.

The heart of the problem appears to be that JsonGenerator::doEncodeString relies on calling "iscntrl" to determine whether a given byte is a control character. In the windows-1252 code page the byte "\x81" is a control character, but in the C locale it is not. This makes the JSON output locale dependent and, more importantly, the encoded string is no longer a valid UTF-8 sequence. I've experimented with running the service in the C locale and found that non-ASCII code points are then encoded correctly. The small program below illustrates the locale dependence.
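This is my own standalone illustration rather than code from the service; the ".1252" locale name is Windows-specific and the second setlocale call may simply fail on other platforms:

    #include <cctype>
    #include <clocale>
    #include <cstdio>

    int main() {
        unsigned char byte = 0x81;  // continuation byte of the UTF-8 sequence EF BD 81

        // In the C locale, 0x81 is not a control character.
        std::setlocale(LC_ALL, "C");
        std::printf("C locale:     iscntrl(0x81) -> %d\n", std::iscntrl(byte) != 0);

        // ".1252" selects the windows-1252 code page on Windows; elsewhere
        // the call may return null and the second line won't print.
        if (std::setlocale(LC_ALL, ".1252") != nullptr) {
            std::printf("windows-1252: iscntrl(0x81) -> %d\n", std::iscntrl(byte) != 0);
        }
        return 0;
    }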
A fix for this would be to use the iscntrl overload provided by the <locale> header, like so: http://git.io/XetN-w This makes the determination of whether a given byte is a control character independent of the runtime environment. A sketch of the change is included at the end of this message.

Let me know whether this looks like a legitimate issue and whether the fix looks appropriate.

Many Thanks,
Hatem

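For reference, here is a sketch of the kind of change I mean. The helper name is mine and I haven't reproduced the exact surrounding code from doEncodeString; it only shows classifying bytes against the classic "C" locale:

    #include <iostream>
    #include <locale>

    // Classify control characters against the classic "C" locale so the
    // result never depends on the process's global locale.
    static bool isControlCharacter(char c) {
        return std::iscntrl(c, std::locale::classic());
    }

    int main() {
        // 0x81 is a control character in windows-1252 but not in the C
        // locale, and is an ordinary continuation byte inside UTF-8 data.
        char byte = '\x81';
        std::cout << std::boolalpha << isControlCharacter(byte) << '\n';  // prints: false
        return 0;
    }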