On Mon, 10 Mar 2014 15:30:00 -0400, John Colvin <john.loughran.col...@gmail.com> wrote:

On Monday, 10 March 2014 at 18:09:51 UTC, Steven Schveighoffer wrote:

Because one can slice out a multi-code-unit code point, whereas one cannot access it via index. Strings would be horribly crippled without slicing. Without indexing, they are fine.

A possibility is to allow index, but actually decode the code point at that index (error on invalid index). That might actually be the correct mechanism.


In order to be correct, both require exactly the same knowledge: the beginning of a code point, followed by the end of a code point. In the indexing case they just happen to belong to the same code point and happen to be one code unit from each other. I don't see how one is any more or less error-prone or fundamentally wrong than the other.

Using indexing, you simply cannot get a multi-code-unit code point: no single code unit represents it, because it doesn't fit in a char. It's guaranteed to fail, whereas slicing will give you access to all the data in the string.
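
To make the difference concrete, here is a minimal sketch of what indexing and slicing return today for a string holding a two-code-unit code point. The sample string is only an illustration, and the source file is assumed to be saved as UTF-8.

```d
import std.stdio : writeln;

void main()
{
    // 'é' (U+00E9) is encoded in UTF-8 as two code units: 0xC3 0xA9.
    string s = "aé";

    // length counts code units, not code points.
    assert(s.length == 3);

    // Indexing yields a single char, here only the lead byte of 'é'.
    char unit = s[1];
    assert(unit == 0xC3);

    // Slicing can span both code units and so recovers the whole code point.
    string slice = s[1 .. 3];
    assert(slice == "é");

    writeln(slice);
}
```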

Now, with indexing actually decoding a code point, one can alias a[i] to a[i..$].front(), which means decode the first code point you come to at index i. This means indexing is slow(er), and returns a dchar. I think as a first step, that might be too much to add silently. I'd rather break it first, then add it back later.
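
A rough sketch of what a decoding index could look like with today's library, assuming Phobos' auto-decoding front for narrow strings. The name codePointAt is hypothetical, used only for illustration; it is not an existing Phobos function.

```d
import std.exception : assertThrown;
import std.range : front;
import std.utf : decode, UTFException;

// Hypothetical helper, not part of Phobos: what a decoding a[i] might return.
dchar codePointAt(string s, size_t i)
{
    return s[i .. $].front; // decodes the first code point found at index i
}

void main()
{
    string s = "aé!"; // code units: 'a', 0xC3, 0xA9, '!'

    assert(codePointAt(s, 0) == 'a');
    assert(codePointAt(s, 1) == 'é'); // the result is a dchar, not a char

    // std.utf.decode performs the same decoding and advances the index.
    size_t i = 1;
    assert(decode(s, i) == 'é');
    assert(i == 3);

    // Index 2 lands on a continuation byte, so decoding there is an error.
    size_t j = 2;
    assertThrown!UTFException(decode(s, j));
}
```

Note that the helper has to inspect the bytes at i to find where the code point ends, which is the extra cost mentioned above compared with plain char indexing.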

-Steve
