Re: Higher level built-in strings

Walter Bright Mon, 19 Jul 2010 21:30:26 -0700

bearophile wrote:

Walter Bright:

1. most string operations, such as copying and searching, even regular expressions, work just fine using regular indices.
2. doing the operations in (1) using code points and having to continually
 decode the strings would result in disastrously slow code.
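
As a rough illustration of point 1, here is a minimal sketch assuming current Phobos, where `std.string.indexOf` returns an offset in code units that can be used directly for slicing. The helper name `tailFrom` is hypothetical and does not come from this thread.

```d
import std.string : indexOf;

// Hypothetical helper: returns the part of s starting at the first
// occurrence of needle, or null if needle is absent.
string tailFrom(string s, string needle)
{
    auto i = s.indexOf(needle);     // i is an offset in code units, -1 if not found
    return i < 0 ? null : s[i .. $];
}

void main()
{
    string s = "caf\u00E9 au lait"; // U+00E9 takes two UTF-8 code units

    assert(s.indexOf("au") == 6);   // 6 code units precede "au", not 5 characters
    assert(tailFrom(s, "au") == "au lait");
    assert(tailFrom(s, "tea") is null);
}
```

The offset is not a character count, but it is exactly what slicing expects, so the search result can be used without any conversion.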


In my original post I have forgotten another difference over arrays: 5b) a
method like ".unit()" that allows to index code units. So "foo".unit(1) is
always O(1). Lower level code can use this method as [] is used for arrays.

This is backwards. The [i] should behave as expected for arrays. As it turns out, indexing by byte is *far* more common than indexing by code unit, in fact, I've never ever needed to index by code unit.

(Though it is sometimes necessary to step through by code unit, that's different from indexing by code unit.)
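
The difference between indexing and stepping can be sketched in D as follows. This is only an illustration, assuming current Phobos (`std.utf.decode`, `std.range.walkLength`); the helper `codePointAt` is hypothetical and shows what a decoding index would have to do.

```d
import std.range : walkLength;
import std.utf : decode;

// Hypothetical helper (not from the thread): returns the n-th decoded
// character by decoding from the start, which is O(n) rather than O(1).
dchar codePointAt(string s, size_t n)
{
    size_t i = 0;               // offset into s, in UTF-8 code units
    foreach (_; 0 .. n)
        decode(s, i);           // advances i past one decoded character
    return decode(s, i);
}

void main()
{
    string s = "na\u00EFve";    // U+00EF occupies two UTF-8 code units

    assert(s.length == 6);      // .length counts UTF-8 code units (bytes)
    assert(s[1] == 'a');        // s[i] is a plain O(1) array access
    assert(s.walkLength == 5);  // counting characters requires decoding

    // Stepping through: foreach with dchar decodes as it goes.
    size_t count;
    foreach (dchar c; s)
        ++count;
    assert(count == 5);

    assert(codePointAt(s, 2) == '\u00EF');
}
```

Stepping costs one decode per element, while a decoding index has to re-walk the string from the start on every call.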

3. the user can always layer a code point interface over the strings, but
going the other way is not so practical.


This is true. But it makes the string usage unnecessarily low-level and
hard...

I don't believe that manipulating strings in D is hard, even if you do have to work with multibyte characters. You do have to be aware they are multibyte, but I think that just comes with being a programmer.



A better design in a smart system language as D is to give strings a
default high level "interface" that sees strings as what they are at high
level, and add a second lower level interface when you need faster
lower-level fiddling (so they have [] that returns code points and unit()
that returns code units).

I have some moderate experience with using utf. First there's the D javascript engine, which is fully utf'd. The D string design fits in with it perfectly. Then there are chunks of C++ ascii-only code I've translated to D, and it then worked with utf-8 without further modification.
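
One reason ASCII-only code can carry over unchanged is that every byte of a multibyte UTF-8 sequence has its high bit set, so it can never be mistaken for an ASCII delimiter. The sketch below is a hypothetical example of such a routine, not code from the projects mentioned above.

```d
// Hypothetical ASCII-era routine: counts comma-separated fields by
// scanning code units, with no knowledge of UTF-8.
size_t countFields(const(char)[] line)
{
    size_t fields = 1;
    foreach (char c; line)      // iterating as char visits code units, no decoding
        if (c == ',')
            ++fields;
    return fields;
}

void main()
{
    // Pure ASCII input.
    assert(countFields("a,b,c") == 3);

    // Multibyte input: the same byte-wise scan still finds only real commas.
    assert(countFields("caf\u00E9,na\u00EFve,\u65E5\u672C") == 3);
}
```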

Based on that, I believe the D string design hits the sweet spot between efficiency and utility.
