On 1/17/11 6:44 AM, Steven Schveighoffer wrote:
On Sun, 16 Jan 2011 13:06:16 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
On 1/15/11 9:25 PM, Jonathan M Davis wrote:
Considering that strings are already dealt with specially in order to
have an element type of dchar, I wouldn't think that it would be all that
disruptive to make it so that they had an element type of Grapheme
instead. Wouldn't that then fix all of std.algorithm and the like without
really disrupting anything?
It would make everything related a lot (a TON) slower, and it would
break all client code that uses dchar as the element type, or is
otherwise unprepared to use Graphemes explicitly. There is no question
there will be disruption.
I would have agreed with you last week. Now I understand that using
dchar is just as useless for Unicode as using char.
This is one extreme. Char only works for English. Dchar works for most
languages. It won't work for a few. That doesn't make it useless for
languages that work with it.
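
To make the char / dchar / Grapheme distinction concrete, here is a rough sketch in Python (not D, and not anything from Phobos). The helper count_units is a hypothetical name, and its grapheme count is only an approximation that treats every non-combining code point as the start of a new grapheme; full grapheme segmentation as defined by Unicode is more involved.

```python
import unicodedata

def count_units(text):
    """Return (UTF-8 code units, code points, approximate graphemes)."""
    utf8_units = len(text.encode("utf-8"))  # roughly what a char-based view sees
    code_points = len(text)                 # roughly what a dchar-based view sees
    # Assumption: a grapheme starts at each code point whose canonical
    # combining class is 0. This is a simplification, not full segmentation.
    graphemes = sum(1 for c in text if unicodedata.combining(c) == 0)
    return utf8_units, code_points, graphemes

precomposed = "\u00e9"   # e-acute as a single code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

print(count_units(precomposed))
print(count_units(decomposed))
# Code-point-wise comparison treats the two spellings as different strings.
print(precomposed == decomposed)
# After NFC normalization they compare equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

For text that is already in precomposed form, the code point count and the grapheme count agree, which is the case where a dchar element type behaves as expected. They diverge once combining marks appear.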
Will it be slower? Perhaps. A TON slower? Probably not.
It will be a ton slower.
But it will be correct. Correct and slow is better than incorrect and
fast. If I showed you a shortest-path algorithm that ran in O(V) time,
but didn't always find the shortest path, would you call it a success?
The comparison doesn't apply.
We need to get some real numbers together. I'll see what I can create
for a type, but someone else needs to supply the input :) I'm short on
Unicode data, and every attempt I've made to create some has ended
in failure. I have one example of one composed character in this thread
that I can cling to, but in order to supply some real numbers, we need a
large amount of data.
I very much appreciate that you're doing actual work on this.
Andrei