Re: An overview of the Parrot interpreter

Dan Sugalski Mon, 03 Sep 2001 16:15:57 -0700
At 06:37 PM 9/3/2001 -0400, Sam Tregar wrote:
>On Sun, 2 Sep 2001, Simon Cozens wrote:
>
> > For instance, the Parrot VM will have a register architecture, rather
> > than a stack architecture.
>
>s/rather than/as well as/;  # we've got a stack of register frames, right?

Well, register in the sense that most CPUs are register machines. They've 
still got stacks, but...

> > There will be global and private opcode tables; that is to say, an area
> > of the bytecode can define a set of custom operations that it will use.
> > These areas will roughly map to compilation units of the original
> > source; each precompiled module will have its own opcode table.
>
>Side note: this isn't making sense to me.  I'm looking forward to further
>explanation!

Basically chunks of perl code can define opcodes on the fly--they might be 
perl subs that meet the proper criteria, or opcode functions defined by C 
code with magic stuck in the parser, or wacky optimizer extensions or 
whatever. There won't be a single global table of these, since we can 
potentially be loading in precompiled code (modules, say). Each 
"compilation unit" has its own table of opcode number->function maps.

If you want to think of it C-ishly, each object module would have its own 
opcode table.
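
As a rough illustration of the per-unit tables described above, here is a 
minimal sketch in Python. Every name in it (CompilationUnit, run, op_inc, 
op_double) is hypothetical and none of it is Parrot's actual API; it only 
shows how the same opcode number can map to different functions depending 
on which compilation unit is executing.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; not Parrot's real interfaces.
OpFunc = Callable[[List[int], List[int]], None]
Instruction = Tuple[int, List[int]]  # (opcode number, operands)


class CompilationUnit:
    """One precompiled chunk of code carrying its own opcode table."""

    def __init__(self, name: str, optable: Dict[int, OpFunc],
                 code: List[Instruction]) -> None:
        self.name = name
        self.optable = optable  # opcode number -> function, private to this unit
        self.code = code


def run(unit: CompilationUnit, registers: List[int]) -> None:
    """Dispatch each instruction through the unit's own table, not a global one."""
    for opnum, operands in unit.code:
        unit.optable[opnum](registers, operands)


def op_inc(regs: List[int], args: List[int]) -> None:
    regs[args[0]] += 1


def op_double(regs: List[int], args: List[int]) -> None:
    regs[args[0]] *= 2


# Opcode 0 means "increment" in unit A and "double" in unit B.
unit_a = CompilationUnit("A", {0: op_inc}, [(0, [0]), (0, [0])])
unit_b = CompilationUnit("B", {0: op_double}, [(0, [0])])

regs = [5]
run(unit_a, regs)
run(unit_b, regs)
```

Because lookup always goes through the unit being run, two separately 
compiled modules never have to agree on opcode numbering.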

> > If our PMC is a string and has a vtable which implements Perl-like
> > string operations, this will return the length of the string. If, on the
> > other hand, the PMC is an array, we might get back the number of
> > elements in the array. (If that's what we want it to do.)
>
>Ok, so one example of a PMC is a Perl string...

Nope. Perl scalar. Strings are lower-level, and a little different.
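
To make the vtable idea in the quoted text concrete, here is a small 
sketch, again in Python and again with made-up names (PMC, length_op, 
scalar_vtable, array_vtable); it is an assumption about the general shape, 
not Parrot's real data layout. The "length" operation just calls whatever 
the PMC's vtable supplies, and the scalar PMC wraps a lower-level string 
rather than being one.

```python
from typing import Any, Callable, Dict

# Hypothetical vtable: operation name -> function over the PMC's payload.
VTable = Dict[str, Callable[[Any], int]]


class PMC:
    """A value paired with the table of functions that define its behaviour."""

    def __init__(self, vtable: VTable, data: Any) -> None:
        self.vtable = vtable
        self.data = data


def length_op(pmc: PMC) -> int:
    """The generic 'length' op: it only looks up and calls the vtable entry."""
    return pmc.vtable["length"](pmc.data)


# A scalar PMC holding a string reports the string's length ...
scalar_vtable: VTable = {"length": lambda data: len(str(data))}
# ... while an array PMC reports its number of elements.
array_vtable: VTable = {"length": lambda data: len(data)}

scalar = PMC(scalar_vtable, "parrot")
array = PMC(array_vtable, [10, 20, 30])
```

Here length_op(scalar) and length_op(array) run the same op but reach 
different functions through each PMC's vtable.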

> > Parrot provides a programmer-friendly view of strings. The Parrot string
> > handling subsection handles all the work of memory allocation,
> > expansion, and so on behind the scenes. It also deals with some of the
> > encoding headaches that can plague Unicode-aware languages.
>
>Or not!  Are Perl strings PMCs or not?  Why does Parrot want to handle
>Unicode?  Shouldn't that go in a specific language's string PMC vtables?

Strings are a step below PMCs. And Parrot knows about Unicode because it's 
the least sucky common denominator.

We're deliberately not forcing Unicode everywhere. Tried that with perl 5; 
it worked OK, but it has some expense, and forces some rude cultural issues.

Most of the non-US world has their own private data encoding methods they 
like just fine. Big5 Traditional, Big5 Simplified, Shift-JIS, EBCDIC, and 
any of a half-zillion variants of ASCII (all with a different set of 
characters in the high 128 slots) work fine for people. If we abstract 
things out a bit so that perl doesn't have to care much, we win in places 
we don't know. This code:

   while (<>) {
       s/$foo/$bar/;
       print;
   }

is cheap on an ASCII machine, but imagine how expensive it'll be if ARGV 
has Shift-JIS data sources. We need to transform to Unicode then back 
again, and risk mangling the data in the process. Bleah.
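
The extra work can be sketched like this, using Python's standard 
"shift_jis" codec. The function names are hypothetical, and the "native" 
version is deliberately naive: a plain byte-level replace can match across 
character boundaries in a multibyte encoding, so a real implementation 
would need to be encoding-aware. The point is only that the first path 
pays for a decode and an encode on every line, and the second does not.

```python
def substitute_via_unicode(raw: bytes, encoding: str, old: str, new: str) -> bytes:
    """Decode to Unicode, substitute, then encode back: two conversions per line."""
    text = raw.decode(encoding)      # native encoding -> Unicode
    text = text.replace(old, new)    # the actual work
    return text.encode(encoding)     # Unicode -> native encoding


def substitute_native(raw: bytes, encoding: str, old: str, new: str) -> bytes:
    """Convert only the pattern and replacement once; leave the data as it is.

    Naive sketch: ignores character boundaries in multibyte encodings.
    """
    return raw.replace(old.encode(encoding), new.encode(encoding))


line = "日本語\n".encode("shift_jis")
via_unicode = substitute_via_unicode(line, "shift_jis", "日本", "英")
native = substitute_native(line, "shift_jis", "日本", "英")
```

For this simple input both paths produce the same bytes; the difference is 
how much conversion happens along the way, and how many chances there are 
for a lossy mapping to sneak in.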

"Everything Unicode" more or less says to the non-Unicode world (i.e. 
everyone except maybe Western Europe and North America) "Boy you suck, but 
I guess we'll make some token accommodations for you."

You can imagine how well that one will go over...

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk