On Sunday, 13 October 2013 at 14:14:14 UTC, nickles wrote:
> Ok, I understand, that "length" is - obviously - used in
> analogy to any array's length value.
> Still, this seems to be inconsistent. D elaborates on
> implementing "char"s as UTF-8 which means that a "char" in D
> can be of any length between 1 and 4 bytes for an arbitrary
> Unicode code point. Shouldn't then this (i.e. the character's
> length) be the "unit of measurement" for "char"s - like e.g.
> the size of the underlying struct in an array of "struct"s? The
> story continues with indexing "string"s: In a consistent
> implementation, shouldn't
> writeln("säд"[2])
> return "д" instead of the trailing surrogate of this cyrillic
> letter?

I think the root misunderstanding is that you think that a string
is random access.
A string *isn't* random access. It is stored *inside* an array of
UTF-8 code units, but unless you know *exactly* what you are doing,
you shouldn't index, slice or take the length of a string: all three
operate on code units, not on characters.
A string should be handled like a bidirectional range.
Once you've understood that, it becomes much simpler.
You want the first character? front.
You want to skip the first character? popFront.
You want an arbitrary character in O(N) time?
myString.dropExactly(N).front;
You want an arbitrary character in O(1) time?
You can't.
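
To make that concrete, here is a small sketch of the range-style access described above, applied to the "säд" example from the question. It assumes a reasonably recent Phobos where front, popFront and walkLength are reachable through std.range, and it uses std.range's dropExactly for the O(N) lookup. The helper nthCodePoint is a made-up name for illustration, not something from the standard library.

```d
import std.range : dropExactly, front, popFront, walkLength;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): returns the n-th code point
// by walking the string from the start, so it costs O(n).
// dropExactly assumes the string holds at least n + 1 code points.
dchar nthCodePoint(string s, size_t n)
{
    return s.dropExactly(n).front;
}

void main()
{
    string s = "säд";

    writeln(s.length);     // 5: UTF-8 code units, not characters
    writeln(s.walkLength); // 3: code points, counted by walking the range

    writeln(s.front);      // first code point, decoded on the fly

    auto rest = s;
    rest.popFront();       // skips one whole code point, not one byte
    writeln(rest);

    writeln(nthCodePoint(s, 2)); // third code point, reached in O(n)
}
```

Note that s[2] on the same string would hand you a single code unit from the middle of "ä", which is exactly the surprise the question started with.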