On Monday, 10 March 2014 at 18:13:14 UTC, Steven Schveighoffer
wrote:
Indexing is rarely a feature one needs or should use,
especially with encoded strings.
If I were writing something like a chat or terminal window, I
would want to be able to jump to chunks of text based on some
sort of buffer length, then search for actual character
boundaries. Similarly, if I were indexing text, I wouldn't care what
the underlying data is, just whether any particular sequence of n bytes
has been seen together somewhere in a document. For the latter case,
I don't need to be able to interpret the data as text while
indexing, but once I perform an actual search and want to jump
the user to that line in the file, being able to take a byte
offset that I had stored in the index and convert that to a
textual position would be good.
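To make that last step concrete, here is a rough sketch of turning a stored byte offset back into a line and column. It assumes the buffer is valid UTF-8 and that columns are counted in code points; `TextPos`, `snapToBoundary` and `toTextPos` are names made up for illustration, not anything in Phobos.

```d
import std.utf : count;

/// Hypothetical result type: 1-based line and code-point column.
struct TextPos
{
    size_t line;
    size_t column;
}

/// Step back from an arbitrary byte offset to the first byte of the
/// code point that contains it (assumes valid UTF-8).
size_t snapToBoundary(const(char)[] text, size_t offset)
{
    if (offset >= text.length)
        return text.length;
    // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    while (offset > 0 && (text[offset] & 0xC0) == 0x80)
        --offset;
    return offset;
}

/// Convert a byte offset (e.g. one stored in an index) to a line/column.
TextPos toTextPos(const(char)[] text, size_t offset)
{
    immutable end = snapToBoundary(text, offset);
    size_t line = 1;
    size_t lineStart = 0;
    // '\n' is a single byte in UTF-8, so scanning code units is enough.
    foreach (i; 0 .. end)
    {
        if (text[i] == '\n')
        {
            ++line;
            lineStart = i + 1;
        }
    }
    // std.utf.count gives the number of code points in the slice.
    return TextPos(line, count(text[lineStart .. end]) + 1);
}
```

The index itself never decodes anything; decoding only happens for the one line that contains the hit.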
I do think that D should have something like
alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;
And that those sit on top of an underlying immutable(xchar)[]
buffer, providing variants of things like foreach and length
based on code-point or grapheme boundaries. But I don't think
there's any value in reinterpreting "string". Not being a struct
or an object, it doesn't have the extensibility to be useful for
all the variations of access that working with Unicode and the
underlying bytes warrants.
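Something along these lines is what I have in mind, as a minimal sketch only. The `UTF` name and the three aliases are the ones above; every member (`codeUnitLength`, `codePointLength`, `graphemeLength`, `codePoints`, `graphemes`) is a placeholder I invented for illustration, and it leans on `std.utf.count`, `std.utf.byDchar` and `std.uni.byGrapheme` from Phobos.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar, count;

/// Hypothetical wrapper over an immutable code-unit buffer.
struct UTF(Char)
{
    immutable(Char)[] data; // underlying code units, untouched

    /// Length in code units (bytes for char).
    size_t codeUnitLength() { return data.length; }

    /// Length in code points.
    size_t codePointLength() { return count(data); }

    /// Length in grapheme clusters (user-perceived characters).
    size_t graphemeLength() { return data.byGrapheme.walkLength; }

    /// Range of dchar, for foreach over code points.
    auto codePoints() { return data.byDchar; }

    /// Range of std.uni.Grapheme, for foreach over grapheme clusters.
    auto graphemes() { return data.byGrapheme; }
}

alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;
```

With "noe\u0308l" (an e followed by a combining diaeresis), a `String8` would report 6 code units, 5 code points and 4 graphemes, while `data` stays available for anything that just wants the bytes.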