On Thursday, 30 November 2017 at 17:56:58 UTC, Jonathan M Davis
wrote:
> On Thursday, November 30, 2017 03:37:37 Walter Bright via
> Digitalmars-d wrote:
> Language-wise, I think that most of the use of UTF-16 is driven
> by the fact that Java went with UCS-2 / UTF-16, and C# followed
> them (both because they were copying Java and because the Win32
> API had gone with UCS-2 / UTF-16). So, that's had a lot of
> influence on folks, though most others have gone with UTF-8 for
> backwards compatibility and because it typically takes up less
> space for non-Asian text. But the use of UTF-16 in Windows,
> Java, and C# does seem to have resulted in some folks thinking
> that wide characters mean Unicode and narrow characters mean
> ASCII.
>
> - Jonathan M Davis
I think it also simplifies the logic. You are not always trying
to render code points as glyphs; often you just want to inspect
the information a string carries. If you can practically treat
each code point as the unit of data behind the scenes, the
processing logic gets simpler.
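
For what it's worth, here is a minimal sketch in D of what that
buys you (my own illustration, not anyone's quoted code): for
text inside the BMP, each UTF-16 code unit in a wstring is
itself a code point, so code-unit indexing and code-point
indexing coincide; outside the BMP that assumption breaks down
into surrogate pairs and you have to decode.

import std.stdio : writefln, writeln;
import std.utf : count;

void main()
{
    // For BMP text, one UTF-16 code unit == one code point, so
    // indexing by code unit gives you the code point directly.
    wstring bmp = "héllo"w;
    assert(bmp.length == count(bmp)); // 5 code units, 5 code points
    writefln("U+%04X", bmp[1]);       // U+00E9 ('é')

    // Outside the BMP the assumption breaks: one code point is
    // encoded as a surrogate pair of two code units.
    wstring astral = "\U0001F600"w;
    writeln(astral.length);           // 2 code units
    writeln(count(astral));           // 1 code point

    // foreach with a dchar loop variable decodes surrogate
    // pairs, yielding whole code points.
    foreach (dchar c; astral)
        writefln("U+%05X", c);        // U+1F600
}

So the "code point as the unit of data" view holds as long as
you stay within the BMP, which is presumably where much of
UCS-2's original appeal came from.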