On 21-Oct-2015 19:21, Shriramana Sharma wrote:
Shriramana Sharma wrote:

iterating through a
string as a range will produce each semantically meaningful Unicode
character rather than each UTF-8 or UTF-16 codepoint, it does make sense
to do this.

Dear me... I meant UTF-8 encoded byte, rather than "codepoint", since all
characters have codepoints, but not all codepoints (such as the surrogates)
correspond to characters.


Aye, careful here. Unicode is a slippery road... Even setting aside code units and code points, there are also notions like "abstract character" and "user-perceived character". Well, I tried my best to summarize most of it at:
http://dlang.org/phobos/std_uni.html
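
To make the distinction between those terms concrete, here is a rough sketch in Python (picked only as a neutral illustration, it is not D and not how std.uni does it). The function name `count_units` is made up for this example, and the "perceived" count is a crude stand-in that just skips combining marks; real grapheme segmentation follows the Unicode text segmentation rules and handles many more cases.

```python
import unicodedata


def count_units(text):
    """Count the same string at several Unicode 'levels'.

    Hypothetical helper for illustration only.
    """
    return {
        # UTF-8 code units are bytes.
        "utf8_code_units": len(text.encode("utf-8")),
        # UTF-16 code units are 16-bit values; astral characters take two.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # A Python str is a sequence of code points.
        "code_points": len(text),
        # Crude approximation of user-perceived characters:
        # count only code points that are not combining marks.
        "perceived_approx": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


# 'e' followed by COMBINING ACUTE ACCENT, then an emoji outside the BMP.
sample = "e\u0301\U0001F600"
counts = count_units(sample)
print(counts)
```

For this sample the four counts all differ: 7 UTF-8 code units, 4 UTF-16 code units, 3 code points, and 2 under the rough "perceived" approximation. So "character" really depends on which level you mean.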

--
Dmitry Olshansky
