Benjamin Goldberg <[EMAIL PROTECTED]> wrote:
> Leopold Toetsch wrote:
>> I have problems imagining such kind of STRINGs.
> You lack sufficient imagination -- Larry's suggested that Perl6 strings
> may consist of a list of chunks. I can easily imagine each of those
> "chunks" being full-fledged STRING* objects.

Did Larry speak of PerlString or STRING?

> A foolish question: can you imagine strings which are lazily read from a
> file?

Sure.

> ... If we could have str->strstart as a pointer to a
> vector of STRING*s, we wouldn't need any PMC to contain the chunks. And
> the str->encoding api is (already) sufficient for doing the work. The
> only lack is a custom mark, to keep the sub-strings alive.

So you have everything that a string *PMC* has: a list of chunks (hanging
off some pointer), custom mark, one or 2 vtables (encoding stuff) ...

> If we have it in a PerlString derived class, and do not make it part of
> STRING*, then we cannot pass such strings to C functions defined to
> accept strings in STRING* parameters,

Such C functions must be aware of the string API anyway. They can't
assume they get a char * something; they have to call the iterator
interface.

> Well, except that when a PerlInt loses magic going to an INTVAL, the
> resulting integer generally takes *less* memory than it did as a PMC,
> whereas losing magic by changing from a PMC to a STRING could very
> easily result in using *more* memory. (And doing lots of work, which we
> wouldn't need if our string kept its magic).

That's right. But your (or Larry's) proposed list of chunks with custom
mark is effectively a PMC; whether you call it STRING or not doesn't
matter. It's a string PMC with a special vtable. The chunk list contains
STRING* buffers. That's it.

> my str $slurp = File.new($filename).slurp(); # =
> File.slurp($filename)?
> Sure, we could have this read in the whole file, but wouldn't it be
> nicer if it would *lazily* fill in $slurp?

Isn't there a big fat warning in $doc, to avoid such kind of code?
Anyway, either the string iterator calls the file iterator to get the
string, or the above code is illegal, as is tie()ing an "int".

>> Do you really want to slow down all string access, just for one very
>> special corner case?

> I don't believe that it *would* slow down all string access.

2 more indirections for the chunk buffer: it's variable-sized, so it's a
buffer header + buffer memory. And we are creating new strings all over
the place, which really hurts already now.

> For the current string code, we already take O(n) to get a void* pointer
> into an appropriate part of a utf8 string, for each character-index.

Dan said we don't do operations on such kind of string encodings. OTOH,
if the chunks all have a character count, we can quickly locate a
certain position inside such strings.

leo
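
To make the last point concrete, here is a rough C sketch of how a
per-chunk character count lets a lookup skip whole chunks instead of
scanning every character. It is only an illustration under assumptions:
Chunk, ChunkedString and chunked_locate are hypothetical names invented
for this example, not Parrot's actual STRING layout or string API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one chunk buffer (NOT the real STRING struct). */
typedef struct Chunk {
    const char *bytes;   /* encoded data of this chunk */
    size_t      nchars;  /* number of characters stored in this chunk */
} Chunk;

/* Hypothetical chunked string: a vector of chunks hanging off a pointer. */
typedef struct ChunkedString {
    const Chunk *chunks;
    size_t       nchunks;
} ChunkedString;

/*
 * Find the chunk that holds character number `index` (0-based).
 * On success, store the chunk number and the character offset inside
 * that chunk and return 0. Return -1 if `index` is past the end.
 * Cost is proportional to the number of chunks, not characters.
 */
int chunked_locate(const ChunkedString *s, size_t index,
                   size_t *chunk_no, size_t *offset)
{
    size_t i;

    assert(s != NULL && chunk_no != NULL && offset != NULL);

    for (i = 0; i < s->nchunks; i++) {
        if (index < s->chunks[i].nchars) {
            *chunk_no = i;
            *offset   = index;
            return 0;
        }
        /* Skip the whole chunk using only its stored count. */
        index -= s->chunks[i].nchars;
    }
    return -1;
}
```

For three chunks holding "hello", ", " and "world", asking for
character 8 skips the first two chunks by their counts alone and lands
at offset 1 of the third. The offset is still in characters, so a
variable-width encoding would need a scan inside that one chunk, but no
longer over the whole string.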