On Monday, 10 March 2014 at 18:13:14 UTC, Steven Schveighoffer wrote:
Indexing is rarely a feature one needs or should use, especially with encoded strings.

If I were writing something like a chat or terminal window, I would want to be able to jump to chunks of text based on some sort of buffer length, then search for actual character boundaries. Similarly, if I were indexing text, I wouldn't care what the underlying data is, just whether any particular set of n bytes has been seen together somewhere in a document. For the latter case, I don't need to be able to interpret the data as text while indexing, but once I perform an actual search and want to jump the user to that line in the file, being able to take a byte offset that I had stored in the index and convert it to a textual position would be good.
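To make that last step concrete, here is a rough sketch of turning a stored byte offset into a line/column position, assuming the buffer is valid UTF-8. The names TextPos, snapToBoundary and toTextPos are made up for illustration and are not part of Phobos; only std.utf.count is a real library call.

```d
import std.utf : count;

/// Hypothetical result type: 1-based line and column (column counted in code points).
struct TextPos { size_t line; size_t column; }

/// Move a byte offset back to the first byte of the code point that contains it.
size_t snapToBoundary(const(char)[] buf, size_t offset)
{
    if (offset > buf.length) offset = buf.length;
    // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    while (offset > 0 && offset < buf.length && (buf[offset] & 0xC0) == 0x80)
        --offset;
    return offset;
}

/// Convert a byte offset (e.g. one stored in an index) to a textual position.
TextPos toTextPos(const(char)[] buf, size_t offset)
{
    offset = snapToBoundary(buf, offset);
    size_t line = 1, lineStart = 0;
    // '\n' is a single byte in UTF-8, so scanning raw bytes is safe here.
    foreach (i; 0 .. offset)
        if (buf[i] == '\n') { ++line; lineStart = i + 1; }
    // Count code points, not bytes, between the line start and the offset.
    return TextPos(line, count(buf[lineStart .. offset]) + 1);
}
```

The indexing side never has to decode anything; only the lookup at the end treats the bytes as text. Counting columns in graphemes instead of code points would be a further step on top of this.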

I do think that D should have something like

alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;

And that those sit on top of an underlying immutable(xchar)[] buffer, providing variants of things like foreach and length based on code-point or grapheme boundaries. But I don't think there's any value in reinterpreting "string". Not being a struct or an object, it doesn't have the extensibility to be useful for all the variations of access that working with Unicode and the underlying bytes warrants.
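A minimal sketch of what I mean, built on what Phobos already provides. The UTF template and its member names (codeUnits, codePoints, graphemes, byCodePoint, byGraphemes) are hypothetical; std.utf.count, std.utf.byDchar, std.uni.byGrapheme and std.range.walkLength are the existing library pieces it leans on.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar, count;

/// Hypothetical wrapper over an immutable code-unit buffer.
struct UTF(C)
    if (is(C == char) || is(C == wchar) || is(C == dchar))
{
    immutable(C)[] data;

    /// Length in code units (bytes for char, 16-bit units for wchar).
    size_t codeUnits() { return data.length; }

    /// Length in code points.
    size_t codePoints() { return count(data); }

    /// Length in grapheme clusters (user-perceived characters).
    size_t graphemes() { return data.byGrapheme.walkLength; }

    /// Range of dchar, for foreach over code points.
    auto byCodePoint() { return data.byDchar; }

    /// Range of Grapheme, for foreach over grapheme clusters.
    auto byGraphemes() { return data.byGrapheme; }
}

alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;
```

With something like this, foreach (g; s.byGraphemes) and foreach (c; s.byCodePoint) are explicit choices at the call site, and the raw buffer stays reachable through data for the byte-oriented cases above.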
