On Tuesday, 20 August 2013 at 12:59:13 UTC, Andrej Mitrovic wrote:
On 8/19/13, Ramon <s...@thanks.no> wrote:
Plus UTF, too. Even UTF-8, 16 (a very practical compromise in
my mind's eye because with 16 bits one can deal with *every*
language while still not wasting memory).

UTF-8 can deal with every language as well. But perhaps you meant
something else here.

Anyway welcome aboard!

I think he meant that every "modern spoken/written" language fits in the "Basic Multilingual Plane", for which each codepoint fits in a single UTF-16 code unit (2 bytes). Multi-code-unit encodings in UTF-16 are *very* rare.

On the other hand, if you encode Japanese into UTF-8, then you'll spend *3* bytes per codepoint, ergo, "wasted memory".
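
For example, a quick sketch with D's built-in string types (string is UTF-8, wstring is UTF-16):

import std.stdio : writeln;

void main()
{
    string  s8  = "日本語";    // UTF-8:  3 bytes per codepoint for these characters
    wstring s16 = "日本語"w;   // UTF-16: 1 code unit (2 bytes) per BMP codepoint
    writeln(s8.length);   // 9 code units (9 bytes)
    writeln(s16.length);  // 3 code units (6 bytes)
}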

@ Ramon:
I think that is a fallacy:
http://en.wikipedia.org/wiki/UTF-8#Compared_to_UTF-16
Real-world usage is *dominated* by ASCII chars. Unless you have a very specific use case, UTF-8 will occupy *less* room than UTF-16, even if the text contains a lot of foreign characters.
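
A rough illustration: in typical mixed content the ASCII markup and identifiers dominate, so UTF-8 still comes out ahead. The byte counts below are just for this made-up snippet:

import std.conv : to;
import std.stdio : writefln;

void main()
{
    // ASCII markup around a bit of Japanese payload.
    string mixed = `<p class="note">日本語のテキスト</p>`;
    writefln("UTF-8 : %s bytes", mixed.length);                 // 44 bytes
    writefln("UTF-16: %s bytes", mixed.to!wstring.length * 2);  // 56 bytes
}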

Furthermore, UTF-8 is pretty much the "standard". If you keep your strings in UTF-16, you will probably end up regularly transcoding to UTF-8 to interface with char* functions.
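
A minimal sketch of what that looks like (the extra transcoding step and allocation at every C boundary is the point):

import core.stdc.stdio : puts;
import std.string : toStringz;
import std.utf : toUTF8;

void main()
{
    wstring w = "héllo, world"w;   // strings kept in UTF-16
    string  s = w.toUTF8();        // transcode before every C call
    puts(s.toStringz());           // char*-based API expects a UTF-8/ASCII C string
}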

Arguably, the "only" (IMO) use case for UTF-16 is interfacing with Windows' UCS-2 API. But even then, there'll still be some overhead, to make sure you don't have any dual-code-unit (surrogate pair) encodings in your streams.
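
Something along these lines (fitsInUCS2 is just a hypothetical helper name, not anything in Phobos):

import std.stdio : writeln;

// True if a UTF-16 string contains no surrogate pairs,
// i.e. it can be handed to a UCS-2-only API unchanged.
bool fitsInUCS2(wstring s)
{
    foreach (wchar u; s)                  // iterate raw UTF-16 code units
        if (u >= 0xD800 && u <= 0xDFFF)   // surrogate range
            return false;
    return true;
}

void main()
{
    writeln(fitsInUCS2("日本語"w));  // true:  BMP only
    writeln(fitsInUCS2("𝄞"w));       // false: U+1D11E needs a surrogate pair
}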
