On Friday, 13 May 2016 at 21:46:28 UTC, Jonathan M Davis wrote:
The history of why UTF-16 was chosen isn't really relevant to my point (Win32 has the same problem as Java and for similar reasons).

My point was that if you use UTF-8, then it's obvious _really_ fast when you screwed up Unicode-handling by treating a code unit as a character, because anything beyond ASCII is going to fall flat on its face.
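To make the quoted point concrete, here is a minimal Python sketch (not from the original thread). The helper names `first_char_by_code_unit` and `first_char_by_decoding` are made up for illustration. It assumes the typical mistake is indexing raw UTF-8 bytes as if each byte were a character.

```python
def first_char_by_code_unit(data: bytes) -> str:
    """Deliberately buggy: treats one UTF-8 code unit (one byte) as one character."""
    return chr(data[0])


def first_char_by_decoding(data: bytes) -> str:
    """Decodes the UTF-8 bytes first, then takes the first code point."""
    return data.decode("utf-8")[0]


ascii_bytes = "hello".encode("utf-8")
accented_bytes = "été".encode("utf-8")  # 'é' is the two bytes C3 A9 in UTF-8

# For ASCII the buggy helper happens to give the right answer.
ascii_result = first_char_by_code_unit(ascii_bytes)

# For anything beyond ASCII it returns the lead byte reinterpreted as U+00C3.
broken_result = first_char_by_code_unit(accented_bytes)
correct_result = first_char_by_decoding(accented_bytes)

print(repr(ascii_result), repr(broken_result), repr(correct_result))
```

The bug is invisible on ASCII input and shows up on the first accented letter, which is the "obvious really fast" behaviour described above.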

On the other hand, if you deal with UTF-16 text, you can't interpret it as anything other than UTF-16: people either get it right or give up. That holds even for ASCII and even with casts; it's that resilient. With UTF-8, problems happened on a massive scale in LAMP setups: MySQL used latin1 as its default encoding, and almost everything worked fine, so the mistake went unnoticed.
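A rough Python sketch of both halves of that argument follows. It only uses Python's own codecs to simulate the byte-level effect and makes no claim about how MySQL behaves internally. The variable names are illustrative.

```python
text = "naïve café"
utf8_bytes = text.encode("utf-8")

# UTF-8 bytes mislabeled as latin1: every byte is a valid latin1 character,
# so nothing raises. The stored "text" is mojibake, but nobody is told.
as_latin1 = utf8_bytes.decode("latin-1")

# Reading it back under the same wrong assumption restores the original bytes,
# which is why such a setup can appear to work end to end.
round_tripped = as_latin1.encode("latin-1")
looks_fine = round_tripped.decode("utf-8") == text

# Pure ASCII is identical under both interpretations, hiding the bug further.
ascii_same = "plain".encode("utf-8").decode("latin-1") == "plain"

# UTF-16 does not fail silently like this: even ASCII text read as a
# single-byte encoding comes out with a NUL after every letter.
utf16_bytes = "abc".encode("utf-16-le")
utf16_misread = utf16_bytes.decode("latin-1")

print(repr(as_latin1))
print(looks_fine, ascii_same)
print(repr(utf16_misread))
```

The UTF-8 case round-trips cleanly despite the wrong label, while the UTF-16 case is visibly wrong on the very first ASCII string.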
