On 2013/01/08 3:27, Markus Scherer wrote:

> Also, we commonly read code points from 16-bit Unicode strings, and
> unpaired surrogates are returned as themselves and treated as such (e.g.,
> in collation). That would not be well-formed UTF-16, but it's generally
> harmless in text processing.

This kind of behavior is called "garbage in, garbage out" (GIGO). It may be harmless, or it may hurt you later.
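
To make the behavior concrete, here is a minimal Python sketch of the kind of lenient reading described in the quoted text, under the assumption that the string is simply a sequence of 16-bit code units. The names code_points and is_well_formed are hypothetical and chosen for illustration only; this is not the API of ICU or any other library. It also shows one way to detect the garbage up front instead of passing it along.

```python
def code_points(units):
    """Yield code points from a sequence of 16-bit code units.

    Lenient: an unpaired surrogate is yielded as its own value
    instead of raising an error (garbage in, garbage out).
    """
    i, n = 0, len(units)
    while i < n:
        u = units[i]
        i += 1
        # A lead surrogate directly followed by a trail surrogate
        # forms one supplementary code point.
        if 0xD800 <= u <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            u = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        yield u


def is_well_formed(units):
    """Return True if the code units contain no unpaired surrogate."""
    # The lenient reader only ever yields a surrogate value when it
    # was unpaired, so any surrogate in its output means ill-formed input.
    return not any(0xD800 <= cp <= 0xDFFF for cp in code_points(units))


# 'A', a valid surrogate pair, a lone lead surrogate, 'B'
sample = [0x0041, 0xD83D, 0xDE00, 0xD800, 0x0042]
decoded = list(code_points(sample))
```

With this sample, the pair comes back as a single supplementary code point, and the lone 0xD800 comes back unchanged. A caller that cares can run is_well_formed first and reject or repair the input before it reaches later processing.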

Regards,   Martin.
