Andrei Alexandrescu wrote:
On 11/22/10 12:01 PM, Steven Schveighoffer wrote:
On Mon, 22 Nov 2010 12:40:16 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
On 11/22/10 11:22 AM, Steven Schveighoffer wrote:
You're dodging the question. You claim that if I want to use it as an
array, I should use it as an array, and if I want to use it as a range,
use it as a range. I'm simply pointing out why you can't use it as an
array -- because Phobos treats it as a bidirectional range, and you can't
force it to do what you want.
Of course you can. After you were to admit that it makes next to no
sense to sort an array of code units, I would have said "well if
somehow you do imagine such a situation, you achieve that by saying
what you mean: cast the char[] to ubyte[] and sort that".
That wasn't what you said -- you said I can use char[] as an array if I
want to use it as an array, not that I can use ubyte[] as an array
(nobody disputes that).
That still holds. The thing is, sort doesn't sort arrays; it sorts
random-access ranges.
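[Editor's note: the cast-then-sort approach described above can be sketched in D as follows. This is a minimal illustration, not code from the thread; it assumes std.algorithm's sort.]

```d
import std.algorithm : sort;

void main()
{
    char[] a = "dcba".dup;
    // sort(a);  // rejected by Phobos: char[] is presented as a
    //           // bidirectional range of dchar, not a random-access
    //           // range, so sort will not accept it.

    // Saying what you mean: sort the raw code units as integers.
    sort(cast(ubyte[]) a);
    assert(a == "abcd");
}
```

The cast is the explicit signal that you intend to treat the storage as plain bytes rather than as UTF-8 text.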
The thing is, *only* when one wants to create strings does one want to
view the data type as a bidirectional range. When I want to deal
with chars as elements of a container, I don't want to be restricted
to UTF requirements.
If you don't want to be restricted to UTF requirements, use ubyte and
ushort. You're saying "I want to use UTF code units without any
associated UTF meaning".
And easy to understand means easier to avoid mistakes. The point is, the
domain of valid elements in my application is defined by me, not by the
library. The library is making assumptions that my poker hands may
contain UTF-8 characters, while I know in my case they cannot.
Then what's wrong with ubyte? Why do you encode as UTF something that
you know isn't UTF?
Would you put an integral value in a real even though you
know it's only ever integral?
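[Editor's note: a hypothetical sketch of the ubyte suggestion, using the poker-hand scenario from the thread. The rank characters chosen here are invented for illustration.]

```d
import std.algorithm : sort;

void main()
{
    // ASCII-only card ranks stored as ubyte: no UTF decoding semantics
    // apply, so every array/range operation works without restriction.
    ubyte[] hand = cast(ubyte[]) "KQA72".dup;
    sort(hand);   // sorts by ASCII value: '2','7','A','K','Q'
    assert(cast(char[]) hand == "27AKQ");
}
```

Because the domain is defined by the application (ASCII ranks only), ubyte expresses that intent directly, and nothing in Phobos tries to decode it.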
I don't think that's a valid comparison, since we have integer types,
but we don't have ASCII types.
Here's the issue as I see it: there are very common use cases (and lots
of existing C code) for a type which stores an ASCII character.
I think we're seeing the exact same issue that causes people to
mistakenly use 'uint' when they mean 'positive integer'.
It LOOKS as though a char is a subset of dchar (i.e., a dchar in the
range 0..0x7F).
It LOOKS as though a uint is a subset of int (i.e., an int in the range
0..int.max).
But in both cases, the possibility that the high bit could be set,
changes the semantics.
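[Editor's note: both "high bit" pitfalls in the analogy above can be demonstrated concretely. A minimal sketch; walkLength from std.range is used to count decoded dchars, relying on Phobos's range-of-dchar view of strings.]

```d
import std.range : walkLength;

void main()
{
    // A uint looks like a non-negative int until the high bit is set.
    uint u = 0x8000_0000;
    int  i = cast(int) u;
    assert(i < 0);   // the "positive integer" reinterprets as negative

    // A char looks like a dchar in 0..0x7F until the high bit is set:
    // above 0x7F it is only a fragment of a multi-byte UTF-8 sequence.
    string s = "é";
    assert(s.length == 2);      // two UTF-8 code units (0xC3, 0xA9)
    assert(s.walkLength == 1);  // but only one decoded dchar
}
```

In both cases the type reads like a subset of the wider type, yet a set high bit silently changes what the value means.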