On 08/20/2010 12:22 PM, Jonathan M Davis wrote:
On Friday, August 20, 2010 09:44:26 Simen kjaeraas wrote:
Rainer Deyke<rain...@eldwood.com>  wrote:
On 8/19/2010 03:56, Jonathan Davis wrote:
The problem is that chars are not characters. They are UTF-8 code
units.
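To make the distinction concrete, here is a minimal sketch, assuming a current Phobos where `std.range.walkLength` decodes narrow strings. The helper names `codeUnits`, `codePoints`, and `firstCodePoint` are made up for illustration and are not part of any library.

```d
import std.range : walkLength;
import std.utf : decode;

/// Number of UTF-8 code units (bytes) stored in s.
size_t codeUnits(string s) { return s.length; }

/// Number of code points in s, counted by decoding as we walk.
size_t codePoints(string s) { return s.walkLength; }

/// Decodes the first code point of s.
dchar firstCodePoint(string s)
{
    size_t i = 0;
    return decode(s, i);
}

void main()
{
    string s = "h\u00E9llo";         // U+00E9 takes two bytes in UTF-8
    assert(codeUnits(s) == 6);
    assert(codePoints(s) == 5);
    assert(s[1] == 0xC3);            // indexing yields a code unit, not a character
    assert(firstCodePoint("\u00E9") == 0xE9);
}
```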

So what?  You're acting like 'char' (and specifically 'char[]') is some
sort of unique special case.  In reality, it's just one case of encoded
data.  What about compressed data?  What about packed arrays of bits?
What about other containers?

First off, char, wchar, and dchar are special cases already - they're
basically byte, short, and int, but are treated somewhat differently.
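A small sketch of what "treated somewhat differently" can look like in practice, assuming a current D compiler. The character types match the integer types in size, but they are distinct types, their `.init` values are deliberately invalid code units, and `foreach` decodes when asked for `dchar`. The helper names `countAsDchar` and `countAsChar` are hypothetical.

```d
// Same sizes as the 1-, 2-, and 4-byte integer types...
static assert(char.sizeof == 1 && wchar.sizeof == 2 && dchar.sizeof == 4);
// ...but distinct types.
static assert(!is(char == ubyte) && !is(wchar == ushort) && !is(dchar == uint));
// Integers default to 0; character types default to an invalid value.
static assert(ubyte.init == 0);
static assert(char.init == 0xFF);
static assert(wchar.init == 0xFFFF);
static assert(dchar.init == 0x0000FFFF);

/// Iterating with a dchar loop variable makes the language decode UTF-8.
size_t countAsDchar(const(char)[] s)
{
    size_t n;
    foreach (dchar c; s) ++n;
    return n;
}

/// Iterating with a char loop variable visits raw code units.
size_t countAsChar(const(char)[] s)
{
    size_t n;
    foreach (char c; s) ++n;
    return n;
}

void main()
{
    assert(countAsDchar("h\u00E9llo") == 5);
    assert(countAsChar("h\u00E9llo") == 6);
}
```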

One possibility, which would make strings a less integrated part of the
language, is to make them simple range structs, and hide UTF-8/16
details in the implementation. If it were not for the fact that D touts
its UTF capabilities, and that this would make it a little less true,
and the fact that char/wchar/dchar are already treated specially, I
would support this idea.

If you do that, you'd probably do something like

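import std.array : popBack, popFront;
import std.utf : decode;

struct String(C)
{
     C[] array;

     dchar front() { size_t i = 0; return decode(array, i); }
     dchar back()  { /* more complicated code */ }
     // The leading dot selects the module-level array primitives
     // rather than the member functions of the same name.
     void popFront() { .popFront(array); }
     void popBack()  { .popBack(array); }
     bool empty()    { return array.length == 0; }
}

alias String!(immutable char) string;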
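The wrapper above leaves back unimplemented. One way it might be filled in, as a sketch only: use `std.utf.strideBack` to find where the last code point starts, then decode from there. The names `DecodedString` and `reversed` are hypothetical, chosen so they do not clash with the `String` in the message, and the slicing in popFront/popBack is my own choice rather than anything the thread specifies.

```d
import std.utf : decode, stride, strideBack;

/// Hypothetical wrapper exposing a code-unit array as a range of dchar.
struct DecodedString(C)
{
    C[] array;

    @property bool empty() { return array.length == 0; }

    @property dchar front()
    {
        size_t i = 0;
        return decode(array, i);
    }

    @property dchar back()
    {
        // strideBack gives the length of the code point ending at the index.
        size_t i = array.length - strideBack(array, array.length);
        return decode(array, i);
    }

    void popFront() { array = array[stride(array, 0) .. $]; }
    void popBack()  { array = array[0 .. $ - strideBack(array, array.length)]; }
}

/// Collects the code points of s from back to front.
dchar[] reversed(string s)
{
    auto r = DecodedString!(immutable char)(s);
    dchar[] result;
    for (; !r.empty; r.popBack())
        result ~= r.back;
    return result;
}

void main()
{
    // Reversal happens per code point, so the two-byte U+00E9 stays intact.
    assert(reversed("h\u00E9llo") == "oll\u00E9h"d);
}
```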

Grep std/ for byDchar.
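For readers on a current Phobos, `std.utf` provides `byDchar` and `byCodeUnit`. Whether that is the same code this message pointed to in 2010 is not something I can confirm, so treat the following as a sketch against today's library. The helper name `codePointsOf` is made up.

```d
import std.array : array;
import std.utf : byCodeUnit, byDchar;

/// Decodes s lazily with byDchar and collects the result.
dchar[] codePointsOf(string s)
{
    return s.byDchar.array;
}

void main()
{
    string s = "h\u00E9llo";
    assert(codePointsOf(s) == "h\u00E9llo"d); // five decoded code points
    assert(s.byCodeUnit.length == 6);         // six raw UTF-8 code units
}
```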

Andrei
