Re: ElementType!string

Jakob Ovrum Sun, 25 Aug 2013 13:00:48 -0700

On Sunday, 25 August 2013 at 19:25:08 UTC, qznc wrote:

Apparently, ElementType!string evaluates to dchar. I would have expected char. Why is that?

It is mentioned in the documentation of `ElementType`. Use `std.range.ElementEncodingType` or `std.traits.ForeachType` to get `char` and `wchar` when given arrays of those two types.
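
A minimal sketch of how the three traits compare, assuming a reasonably recent Phobos. Note that for `string` and `wstring` the code unit type comes back with its `immutable` qualifier attached:

```d
import std.range : ElementType, ElementEncodingType;
import std.traits : ForeachType;

// ElementType treats narrow strings as ranges of code points.
static assert(is(ElementType!string == dchar));
static assert(is(ElementType!wstring == dchar));

// These two report the code unit type actually stored in the array.
static assert(is(ElementEncodingType!string == immutable(char)));
static assert(is(ElementEncodingType!wstring == immutable(wchar)));
static assert(is(ForeachType!string == immutable(char)));
static assert(is(ForeachType!wstring == immutable(wchar)));

void main() {}
```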


As for the rationale:

`string`, being an alias for `immutable(char)[]`, is an array of UTF-8 code units - an array of `char`s. However, when used as a range it is a forward range of code points (each represented as a UTF-32 code unit - `dchar`). It's a (slightly controversial) choice that was made to make Unicode-correct code the easiest and most intuitive to write, as code points are much more useful than code units.
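
To illustrate the two views of the same data, here is a small sketch. The sample string is my own choice; U+00E9 is used because it needs two UTF-8 code units:

```d
import std.range : front, popFront, walkLength;

void main()
{
    string s = "h\u00E9llo";    // U+00E9 occupies two UTF-8 code units

    assert(s.length == 6);      // array view: counts code units
    assert(s.walkLength == 5);  // range view: counts code points

    auto c = s.front;           // front decodes a whole code point
    static assert(is(typeof(c) == dchar));
    assert(c == 'h');

    s.popFront();               // drops 'h' (one code unit)
    assert(s.front == '\u00E9');
    s.popFront();               // drops both code units of U+00E9
    assert(s.length == 3);      // "llo" remains
}
```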

Note that it is not a random-access range. UTF-8 and UTF-16 are variable-length encodings, so several code units can be required to encode a single code point. Hence, getting the n'th code point in a UTF-8 or UTF-16 string requires decoding from the start rather than a constant-time index.
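
A short sketch of the difference between indexing and walking, again with a sample string of my own. Indexing the array is still allowed and constant-time, but it yields code units, not code points:

```d
import std.range : drop, front, hasLength, isForwardRange, isRandomAccessRange;

// The range traits reflect the code point view of narrow strings.
static assert(isForwardRange!string);
static assert(!isRandomAccessRange!string);
static assert(!hasLength!string);

void main()
{
    string s = "h\u00E9llo";

    // Array indexing returns individual code units:
    // U+00E9 is encoded as the two bytes 0xC3 0xA9.
    assert(s[1] == 0xC3);
    assert(s[2] == 0xA9);

    // Reaching the code point at position 2 means decoding past the first two.
    assert(s.drop(2).front == 'l');
}
```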

Another name for a code point is "character" (technically, a character is what the code point translates to in the UCS). However, it can be a deceptive name - the units we see on screen when rendered are "graphemes", as Unicode characters can be combining, zero-width etc.
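
As a rough illustration of the three levels, the sketch below counts code units, code points and graphemes for an 'e' followed by a combining accent. It assumes a Phobos version that provides `std.uni.byGrapheme`:

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: renders as one unit.
    string s = "e\u0301";

    assert(s.length == 3);                 // UTF-8 code units
    assert(s.walkLength == 2);             // code points
    assert(s.byGrapheme.walkLength == 1);  // graphemes
}
```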

To get a range of UTF-8 or UTF-16 code units, the code units have to be represented as something other than `char` and `wchar`. For example, you can cast your string to `immutable(ubyte)[]` to operate on that, then cast it back at a later point.
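
A minimal sketch of that round trip. The `std.string.representation` call is an addition of mine rather than something mentioned above; as far as I know it performs the same reinterpretation without an explicit cast:

```d
import std.range : ElementType, isRandomAccessRange;
import std.string : representation;

void main()
{
    string s = "h\u00E9llo";

    // Reinterpret the code units as plain bytes; no copy is made.
    auto units = cast(immutable(ubyte)[]) s;
    static assert(is(ElementType!(typeof(units)) == immutable(ubyte)));
    static assert(isRandomAccessRange!(typeof(units)));
    assert(units.length == 6);
    assert(units[1] == 0xC3 && units[2] == 0xA9);

    // Assumed equivalent: representation yields the same bytes.
    assert(s.representation == units);

    // Cast back once the code unit level work is done.
    string back = cast(string) units;
    assert(back == s);
}
```

The cast back is only safe if the bytes are still valid UTF-8 at that point.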
