On Dec 20, 2007, at 10:22 AM, dormando wrote:
> So if you have a utf8 character that's a valid non-space high byte for the first byte, but the second byte is a space, it'd break. I'm not even sure if that happens though.
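It can't. A UTF-8 character is *always* encoded either as a single ASCII byte (high bit clear) or as a sequence of two or more bytes that all have their high bits set. A space is 0x20, so it can never show up as the second (or any later) byte of a multi-byte character.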
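If you'd rather check than take my word for it, here is a quick sketch in Python (not anything from the memcached source; the helper name is made up for illustration). It encodes every code point and looks for a multi-byte sequence containing a byte below 0x80, which would include a space:

```python
def has_ascii_byte_in_multibyte(ch: str) -> bool:
    """True if ch encodes to more than one UTF-8 byte and any byte is < 0x80."""
    encoded = ch.encode("utf-8")
    return len(encoded) > 1 and any(b < 0x80 for b in encoded)


# Walk every Unicode code point. Surrogates (U+D800..U+DFFF) are skipped
# because they cannot be encoded as UTF-8 on their own.
offenders = [
    cp
    for cp in range(0x110000)
    if not 0xD800 <= cp <= 0xDFFF and has_ascii_byte_in_multibyte(chr(cp))
]

# Splitting the raw bytes on a space therefore never cuts a character in half.
parts = "héllo wörld".encode("utf-8").split(b" ")
decoded = [p.decode("utf-8") for p in parts]
```

The list of offenders should come back empty, and both halves of the split should decode cleanly, so tokenizing on the 0x20 byte is safe for valid UTF-8 input.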
http://en.wikipedia.org/wiki/UTF-8#Rationale_behind_UTF-8.27s_design

-Steve
