Andrei Alexandrescu Wrote:

> Lars T. Kyllingstad wrote:
> > Nick Sabalausky wrote:
> >> "Chris Nicholson-Sauls" <[email protected]> wrote in message
> >> news:[email protected]...
> >>> Granted LTR is common enough to be expectable and acceptable. To be
> >>> perfectly honest, I don't believe I have *ever* even used
> >>> wchar/wstring. Char/string gosh yes; dchar/dstring quite a bit as
> >>> well, where I need the simplicity; but I've yet to feel much need for
> >>> the "weirdo" middle child of UTF.
> >>>
> >>
> >> Given that just about anything outside of D (at least as far as I've
> >> seen) that attempts to use unicode does so with UTF-16 (or just uses
> >> UCS-2 and pretends that's UTF-16...), wchar and wstring are great for
> >> dealing with that. For instance, my Goldie engine for GOLD currently
> >> uses wchar in a number of places because GOLD's .cfg format stores
> >> text in...well, presumably UTF-16 (I haven't tested to see if it's
> >> really UCS-2). But yea, as long as you're not dealing with anything
> >> that's already in UTF-16 or that expects it, then it does seem to be
> >> somewhat questionable.
> >
> > I think this says it all:
> >
> > http://en.wikipedia.org/wiki/Utf-16#Use_in_major_operating_systems_and_environments
> >
> > -Lars :)
>
> Yep, there was a frenzy when UCS-2 came about: everybody thought two
> bytes will be enough for everyone. So UCS-2 was widely adopted - who
> wouldn't love to have constant character width? Then, the UTF-16
> surrogate business came about, and the only logical step they could take
> was to migrate to UTF-16, which was upward compatible to UCS-2. I
> personally think UTF-8 is a better overall design though.
>
> Andrei
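For anyone who wants to see the "constant character width" point concretely, here is a rough sketch in Python (not D, and not from any of the posts above) that counts code units per encoding form. The helper names `code_units` and `utf16_units` are made up for illustration. In D terms, the three columns correspond to char, wchar and dchar code units respectively.

```python
def code_units(ch: str) -> dict:
    """Number of code units needed for one code point in each encoding form."""
    return {
        "utf8": len(ch.encode("utf-8")),             # 1-byte code units
        "utf16": len(ch.encode("utf-16-le")) // 2,   # 2-byte code units
        "utf32": len(ch.encode("utf-32-le")) // 4,   # 4-byte code units
    }


def utf16_units(ch: str) -> list:
    """The individual UTF-16 code units (as integers) for one code point."""
    raw = ch.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


# Three code points inside the Basic Multilingual Plane, one outside it.
for ch in ["A", "\u00e9", "\u20ac", "\U0001D11E"]:
    units = " ".join(f"{u:04X}" for u in utf16_units(ch))
    print(f"U+{ord(ch):04X}", code_units(ch), "UTF-16:", units)
```

Everything up to U+FFFF fits in a single 16-bit unit, which is all UCS-2 could represent. U+1D11E needs a surrogate pair in UTF-16, so the fixed width is gone there, while UTF-8 was variable-width from the start.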
"I personally think UTF-8 is a better overall design though." Unicode Technical Note #12 by The Unicode Consortium apparently disagree, recommending UTF-16 for Processing. http://unicode.org/notes/tn12/ The major claim in the TN is that Unicode is optimized for UTF-16. The rest of the argument looks like a VHS (everyone is using it i.e. UTF-16) versus Beta argument. So who's right? My personal view is that whilst they are the *Unicode Consortium*, I have great difficulty in accepting UTF-16 as the one-and-holy encoding. FWIW, there was a subthread during a discussion about the ordained features of programming languages on LtU a while back. http://lambda-the-ultimate.org/node/3166#comment-46233 What Are The Resolved Debates in General Purpose Language Design? Its a long discussion so easier to search for UTF or Unicode on the page if you're interested. cheers Justin Johansson
