On 22/06/2010 19:26, Jonathan M Davis wrote:
div0 wrote:

On 22/06/2010 07:29, Jonathan M Davis wrote:
Okay. If you call until like so

str.until('\"')

you get an Until!(pred,string,char). I want to turn that into a string.
array() doesn't seem to do the trick right now. It used to work, but now
it gives me

main.d(47): Error: template std.array.array(Range) if (isForwardRange!(Range)) does not match any function template declaration
main.d(47): Error: template std.array.array(Range) if (isForwardRange!(Range)) cannot deduce template function from argument types !()(Until!(pred,string,char))

to!string just converts it into a string with the Until! stuff being
included in the string rather than giving me the actual result, so that
doesn't work.

So, what is the correct and preferred way to convert the result of Until!
to a string when you were searching on a string in the first place? The
std.algorithm functions are definitely nice, but they have a tendency to
return hard-to-use types.

- Jonathan M Davis

Could be wrong, but strings aren't (conceptually) arrays any more.

As I understand it, they're definitely arrays. It's just that because
they're arrays of char (well, immutable(char)) but are read as Unicode code
points, an element of the array isn't necessarily a full character, and code
that needs to read code points has to treat them as a range of code points
rather than an array of char. So, whether you treat them as an array depends
a bit on what you're doing with them. As long as you're not actually trying
to interpret them as code points, however, they're the same as any other
array.

I think we're talking about the same thing but with slightly different terminology.

As far as I understand it, for a string (I'm specifically talking about immutable(char)[]) each byte is not a 'code point' or a character.

It may take multiple bytes to encode a 'code point' and it can take multiple code points to encode a character. (Or maybe it's just 2 code points at most, I'm not clear on the details of combining characters)

So it's never valid to randomly access the bytes in a UTF-8 string.
If you take a random byte out of a string, you might be getting only one byte out of a multi-byte encoded 'code point'.
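
As a rough illustration of the point above, here is a minimal sketch, assuming a reasonably recent dmd and Phobos. The helper names codeUnits and codePoints are made up for this example. It shows that indexing a string yields individual UTF-8 bytes, while walking it as a range yields whole code points.

```d
import std.range : walkLength;
import std.stdio : writefln;

// Hypothetical helper: number of UTF-8 code units (bytes) in the string.
size_t codeUnits(string s)
{
    return s.length;
}

// Hypothetical helper: number of code points, found by decoding the string
// as a range from front to back.
size_t codePoints(string s)
{
    return s.walkLength;
}

void main()
{
    // "é" is U+00E9, which UTF-8 encodes as the two bytes 0xC3 0xA9.
    string s = "é";

    assert(codeUnits(s) == 2);
    assert(codePoints(s) == 1);

    // Indexing returns one byte of the encoding, not the code point.
    assert(s[0] == 0xC3);
    assert(s[1] == 0xA9);

    writefln("%s bytes, %s code point(s)", codeUnits(s), codePoints(s));
}
```

Neither s[0] nor s[1] is a usable character on its own here, which is the hazard being described.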

(I have the vague recollection that each byte of an encoded
'code point' is itself clearly defined to be an invalid utf
'code point' so you can't accidentally go using one)

Which is bad, and therefore you should never conceptually treat a string as an array. (To me, array implies random access and that you are going to be doing such.)

When you have an encoded 'code point' in a UTF-8 string, I believe you have to start mucking about with bit shifting to decode it.
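
For what it's worth, the bit shifting can be sketched for the two-byte case as follows. decodeTwoBytes is a hypothetical helper written only for this illustration; it assumes well-formed input and does no validation. In practice std.utf.decode does the same job for sequences of any length.

```d
import std.utf : decode;

// Hypothetical helper: decodes a two-byte UTF-8 sequence by hand.
// Assumes `lead` has the form 110xxxxx and `cont` the form 10xxxxxx;
// nothing is checked.
dchar decodeTwoBytes(char lead, char cont)
{
    // Keep the low 5 bits of the lead byte and the low 6 bits of the
    // continuation byte, then join them into one code point.
    return cast(dchar)(((lead & 0x1F) << 6) | (cont & 0x3F));
}

void main()
{
    string s = "é"; // bytes 0xC3 0xA9, code point U+00E9

    assert(decodeTwoBytes(s[0], s[1]) == 0xE9);

    // The library routine decodes at index i and advances i past the sequence.
    size_t i = 0;
    dchar cp = decode(s, i);
    assert(cp == 0xE9);
    assert(i == 2);
}
```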

Of course strings are arrays, but I think this is just an implementation detail and might be/probably should be changed in future. Andrei was quite emphatic in one of his posts that strings would from now on be bidirectional ranges.

If you decode a string to dstring then you have a list of code points.
I'm not clear on whether randomly accessing a dstring is a good idea or bad idea though.
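
A small sketch of what indexing a dstring gives you, assuming a recent Phobos (toCodePoints is a hypothetical wrapper around std.conv.to). Each element is a complete code point, so an index never lands in the middle of an encoding, but a combining sequence still spans more than one element.

```d
import std.conv : to;

// Hypothetical wrapper: decode a UTF-8 string into an array of code points.
dstring toCodePoints(string s)
{
    return to!dstring(s);
}

void main()
{
    // 'e' followed by U+0301 (combining acute accent): displayed as a
    // single accented character, but stored as two code points.
    string s = "e\u0301";
    dstring d = toCodePoints(s);

    assert(s.length == 3); // 1 byte for 'e' + 2 bytes for U+0301
    assert(d.length == 2); // two code points

    // Random access is safe at the code point level...
    assert(d[0] == 'e');
    // ...but d[1] is only the accent, not a character in its own right.
    assert(d[1] == 0x0301);
}
```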

If any of that seems wrong please let me know. I'm a bit hazy on the finer points of utf myself and I don't want to carry on making invalid assumptions about it.



They are bidirectional ranges which is why the array call doesn't work.
Though how you actually get a string back I don't know.


I wasn't clear enough. I was basically doing this:

to!string(array(str.until('\"')));

As I understand it, that forces the Until type into an array of whatever
type (probably char[]) and then to!string would convert it to
immutable(char)[]. It's the cleanest way that I found (well, actually, the
only way I think) to convert the result of until() to string in spite of the
fact that it was called with a string in the first place. It's one of the
prices of flexibility, I guess.
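
The conversion described above can be sketched as below, assuming a recent dmd and Phobos where array() accepts the Until range (it did not in the release that produced the errors quoted earlier). upTo is a hypothetical helper name used only for this example.

```d
import std.algorithm : until;
import std.array : array;
import std.conv : to;

// Hypothetical helper: returns the part of str before the first
// occurrence of stop, as a string.
string upTo(string str, char stop)
{
    // until() yields a lazy range of decoded code points (dchar),
    // array() collects them into a dchar[], and
    // to!string re-encodes that as UTF-8.
    return to!string(array(str.until(stop)));
}

void main()
{
    assert(upTo(`key"value`, '\"') == "key");

    // With no sentinel in the input, the whole string comes back.
    assert(upTo("plain", '\"') == "plain");
}
```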

- Jonathan M Davis

Sry, means nothing to me. I'm still using dmd 2.028. :(

--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
