On 21-Oct-2015 19:21, Shriramana Sharma wrote:
> Shriramana Sharma wrote:
>> iterating through a
>> string as a range will produce each semantically meaningful Unicode
>> character rather than each UTF-8 or UTF-16 codepoint, it does make sense
>> to do this.
>
> Dear me... I meant UTF-8 encoded byte, rather than "codepoint", since all
> characters have codepoints, but not all codepoints (such as the surrogates)
> correspond to characters.

Aye, careful here. Unicode is a slippery road... Not even talking of
code units and code points, there are things like "abstract character"
and "user-perceived character". Well, I tried my best to summarize most
of it at:
http://dlang.org/phobos/std_uni.html
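
To make the three levels concrete, here is a minimal sketch, assuming the
Phobos functions std.utf.byDchar and std.uni.byGrapheme. The helper names
(codeUnits, codePoints, graphemes) and the sample string are made up for
illustration only. The string spells "noël" with a combining diaeresis
(U+0308), so the three counts differ:

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

// Hypothetical helpers, named for illustration only.

// Number of UTF-8 code units (bytes) in the string.
size_t codeUnits(string s) { return s.length; }

// Number of code points, decoding the UTF-8 explicitly.
size_t codePoints(string s) { return s.byDchar.walkLength; }

// Number of grapheme clusters ("user-perceived characters").
size_t graphemes(string s) { return s.byGrapheme.walkLength; }

void main()
{
    import std.stdio : writeln;

    // 'e' followed by U+0308 COMBINING DIAERESIS renders as one "ë".
    string s = "noe\u0308l";

    writeln("code units:  ", codeUnits(s));
    writeln("code points: ", codePoints(s));
    writeln("graphemes:   ", graphemes(s));
}
```

The combining mark takes two bytes in UTF-8 and is its own code point, yet
it belongs to the same grapheme as the preceding 'e', so each level of
iteration sees a different number of elements.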
--
Dmitry Olshansky