Re: Inconsistency

nickles Sun, 13 Oct 2013 07:15:42 -0700

Ok, I understand that "length" is - obviously - used in analogy to any array's length value.

Still, this seems to be inconsistent. D makes a point of implementing "char"s as UTF-8, which means that a character in D can take anywhere between 1 and 4 bytes for an arbitrary Unicode code point. Shouldn't this (i.e. the character's length) then be the "unit of measurement" for "char"s - like, e.g., the size of the underlying struct in an array of "struct"s? The story continues with indexing "string"s: in a consistent implementation, shouldn't


   writeln("säд"[2])

return "д" instead of a lone code unit from the middle of a multi-byte character? Btw., how do YOU implement this for "string" (for "dstring" it works - logically; for "wstring" the same problem arises for code points above U+FFFF)?
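
To make the question concrete, here is a minimal sketch of what the indexing yields, assuming a current D compiler with Phobos and a source file saved as UTF-8. The helper codePointAt is a hypothetical name used only for illustration, not something from the standard library; it decodes the whole string to UTF-32 first, which is simple but not efficient.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): the n-th code point of a
// UTF-8 string, found by converting the whole string to UTF-32.
dchar codePointAt(string s, size_t n)
{
    return s.to!dstring[n];
}

void main()
{
    string s = "säд";

    // .length counts UTF-8 code units: 's' is 1 byte, 'ä' and 'д' are 2 each.
    writeln(s.length);            // 5

    // Indexing works on code units too; index 2 is the second byte of 'ä'.
    writeln(cast(ubyte) s[2]);    // 164 (0xA4)

    // Going through UTF-32 gives the third code point.
    writeln(codePointAt(s, 2));   // д

    // A dstring holds one code point per element, so plain indexing works.
    dstring d = "säд"d;
    writeln(d.length);            // 3
    writeln(d[2]);                // д
}
```

So for this particular literal, index 2 lands inside "ä" rather than on "д", and only the "dstring" variant indexes by code point.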

Also, I understand that there is the std.utf.count() function, which returns the length that I was searching for. However, why - if D is so UTF-8-centric - isn't this function implemented in the core like ".length"?
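
For comparison, a short sketch of the two lengths side by side, again assuming a current Phobos. Using walkLength here relies on the standard library's decoding of narrow strings when they are treated as ranges, which is an assumption about the library version in use.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.utf : count;

void main()
{
    string s = "säд";

    writeln(s.length);      // 5: UTF-8 code units (bytes)
    writeln(count(s));      // 3: code points, via std.utf.count
    writeln(s.walkLength);  // 3: the string walked as a range of code points
}
```

Both count and walkLength have to scan the string, so they are O(n), whereas ".length" is a stored value.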
