At 12:19 PM 10/25/2001 -0400, Sam Tregar wrote:
>On Thu, 25 Oct 2001, Dan Sugalski wrote:
>
> > The only bits of the interpreter that much care about the
> > string data are the regex engine parts, and those only operate on
> > fixed-sized data.
>
>Care to elaborate?  I thought the mandate from Larry was to have regexes
>compile down to a stream of string ops.  Doesn't that mean it should work
>regardless of the encoding of the string?

Since the encoding just determines how the abstract code point numbers are 
represented in bytes, I'm OK with requiring strings we process internally 
to be in a fixed-size version.
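
To make the distinction concrete, here is a small sketch in Python (not Parrot code; the sample string and variable names are purely illustrative). It shows the same three code points stored in a variable-width encoding and in a fixed-width one:

```python
# Three abstract code points: U+0061, U+00E9, U+4E2D.
text = "a\u00e9\u4e2d"
code_points = [ord(ch) for ch in text]

# Variable width: UTF-8 spends 1, 2 and 3 bytes on these code points.
utf8 = text.encode("utf-8")
utf8_widths = [len(ch.encode("utf-8")) for ch in text]

# Fixed width: UTF-32 (little-endian, no BOM) spends 4 bytes on every code point.
utf32 = text.encode("utf-32-le")

# Both byte streams decode back to the same code point numbers.
same = list(utf8.decode("utf-8")) == list(utf32.decode("utf-32-le"))
```

The code point numbers are identical in both cases; only the byte layout differs, which is why converting to the fixed-size form before processing loses nothing.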

And regexes will be done with a stream of parrot opcodes, presuming that's 
not too slow. There'll be ops to reference the code point at position X in 
a string and check to see if it's in a list of other code points and 
suchlike things. Basically we'll peek under the covers, but only for 
fixed-length strings.
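
As a rough illustration of why fixed-length data makes this cheap, here is a sketch in Python. The helper names (to_fixed_width, code_point_at, in_class) are hypothetical and are not Parrot ops; the 4-bytes-per-code-point layout is an assumption chosen for the example.

```python
import struct

def to_fixed_width(data: bytes, encoding: str) -> bytes:
    """Transcode an encoded byte stream to 4 bytes per code point (assumed layout)."""
    return data.decode(encoding).encode("utf-32-le")

def code_point_at(buf: bytes, index: int) -> int:
    """Fetch the code point at position `index` with a single offset calculation."""
    return struct.unpack_from("<I", buf, index * 4)[0]

def in_class(buf: bytes, index: int, code_points: frozenset) -> bool:
    """Check whether the code point at `index` is in a set of other code points."""
    return code_point_at(buf, index) in code_points

# Convert once up front, then peek under the covers as often as needed.
buf = to_fixed_width("caf\u00e9".encode("utf-8"), "utf-8")
vowels = frozenset(ord(ch) for ch in "aeiou")
```

With a variable-width encoding, finding position X would mean scanning from the start of the buffer; with fixed-width data it is a multiply and a load, which is the cheat being argued for.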

> > The interpreter can only peek inside a string if that string is of
> > fixed length, and the interpreter doesn't actually care about the
> > character set the data is in.
>
>Why is this necessary at all?  Wouldn't it be preferable to have all
>access go through the String vtable regardless of the encoding?

Speed. We're going to take something of a hit decomposing to ops as it 
is--if we can safely cheat, I'm OK with mandating it. :)

> > =item encoding
> >
> > Pointer to the library that handles the string encoding. Encoding is
> > basically how the stream of bytes pointed to by C<bufstart> can be
> > turned into a stream of 32-bit codepoints. Examples include UTF-8, Big
> > 5, or Shift JIS. Unicode, Ascii, or EBCDIC are B<not> encodings.first
>
>.first?

Trailing buffer gook.

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
