Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Bryan C . Warnock

On Tuesday 09 October 2001 02:11 pm, Michael Maraist wrote:
> I saw another post on the improvements provided by the simple
> modification of:
>
> goto *array[*code];
>
> lbl1:
>   # do stuff
>   goto *array[*code];
> lbl2:
>   # do stuff
>   goto *array[*code];
> ...
> end:
>   return;
>
> And they were right.  I received MASSIVE performance boosts.

Yeah, this is what I coded up into the current Parrot baseline.
I went from 46M to 113M ops/sec.  (On test.pasm.  This is my Linux/Athlon 
system.  The 46M is normalized to a no-input-checking dispatch loop.)

> The next thing to test is having 100 core op-codes.  Then it'll be
> interesting to see how horrible adding the conditionals are.
> Theoretically we don't need any conditionals.  "change_state" op-codes
> can simply return NULL just like end does, thereby breaking out of the
> loop automatically.

Except how do you determine at compile time when to insert change state 
opcodes?  I'm convinced (and humbled :-) by this approach (even though it 
makes gcc scream under Wap rules), but I'm not convinced that we'll have an 
all-inclusive static set of opcode functions.  (Which means that you'll 
still have to check for a valid target.)

>
> I never understood the reasoning behind while(*code) instead of just
> while(code) (which is properly handled here).  Obviously there's value
> to while( code > start && code < end ), but that's in the non-hell-bent
> loop.

As I previously posted, that was probably my typo (left over from caching 
the value on x86) and should have been pc.

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Dan Sugalski

At 03:59 PM 10/9/2001 -0400, Michael Fischer wrote:
>Declaring that static void * array[] = {...} in a header,
>produced by process_opfunc.pl. Dan, Michael, Bryan,
>care for me to revamp my patch to print out a goto
>table like the above?

Sure, if it works and doesn't get in the way of doing it differently.

>Dan, so out of all this, will vtable funcs have to be
>represented in basic_opcodes.ops, or are they above all
>that, and conversions (if any) done elsewhere?

Vtable code goes elsewhere.

>Or, more
>cogently (*cough*), is the method of writing more stuff
>in basic_opcodes.ops and letting the perl scripts handle
>the conversion to C code still valid regardless of what
>we do with vtables, modules, etc.?

Yup. While the vtable code will be processed differently, it'll still be 
processed.


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Michael Fischer

On Oct 09, Michael Maraist <[EMAIL PROTECTED]> took up a keyboard and banged out

> > void
> > runloop(int**code) {
> > static void* array[] = { &&op_end, &&op1, &&op2 };
> >
> > start:
> >   goto *array[*code];
> >
> > op1:
> >   foo; goto start;
> >
> > op2:
> >   foo; goto start;
> >
> > op_end:
> >   return;
> > } // end code


Declaring that static void * array[] = {...} in a header,
produced by process_opfunc.pl. Dan, Michael, Bryan,
care for me to revamp my patch to print out a goto
table like the above?

Dan, so out of all this, will vtable funcs have to be
represented in basic_opcodes.ops, or are they above all
that, and conversions (if any) done elsewhere? Or, more
cogently (*cough*), is the method of writing more stuff
in basic_opcodes.ops and letting the perl scripts handle
the conversion to C code still valid regardless of what
we do with vtables, modules, etc.?

Or will that stuff call for a change to the design of
those perl scripts, and the .ops file?


Michael Fischer
-- 
Developer,  "Beware of bugs in the above code.  
[EMAIL PROTECTED] I have only proven it correct, not tested it."
[EMAIL PROTECTED]-- Donald Knuth



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Michael Fischer

On Oct 09, Michael Maraist <[EMAIL PROTECTED]> took up a keyboard and banged out

> > void
> > runloop(int**code) {
> > static void* array[] = { &&op_end, &&op1, &&op2 };
> >
> > start:
> >   goto *array[*code];
> >
> > op1:
> >   foo; goto start;
> >
> > op2:
> >   foo; goto start;
> >
> > op_end:
> >   return;
> > } // end code


Declaring that static void * array[] = {...} in a header,
produced by process_opfunc.pl. Dan, Michael, Bryan,
care for me to revamp my patch to print out a goto
table like the above?

Dan, so out of all this, will vtable funcs have to be
represented in basic_opcodes.ops, or are they above all
that, and conversions (if any) done elsewhere? Or, more
cogently (*cough*), is the method of writing more stuff
in basic_opcodes.ops and letting the perl scripts handle
the conversion to C code still valid regardless of what
we do with vtables, modules, etc.?

Or will that stuff call for a change to the design of
those perl scripts, and the .ops file?


Michael Fischer
-- 
Developer,  "Beware of bugs in the above code.  
[EMAIL PROTECTED] I have only proven it correct, not tested it."
[EMAIL PROTECTED]-- Donald Knuth



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Dan Sugalski

At 12:54 PM 10/9/2001 -0400, Bryan C. Warnock wrote:
>We are going to need to come up with a better strategy than a DO_OP macro,
>though.

DO_OP can certainly be a placeholder we rip out and replace as part of the 
configuration process. We're not limited to C's macro system if we don't 
want it. Heck, the whole interpreter loop can be reduced to a RUN_SOME_CODE 
macro that's ripped out and replaced if we want.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Dan Sugalski

At 02:11 PM 10/9/2001 -0400, Michael Maraist wrote:
>Oh by the way.
>
>500M ops in 15 seconds is 1.02 Billion parrot ops / second.  This isn't
>too shabby on a mere 466MHZ machine.

The performance is impressive regardless, but isn't that 33M parrot ops/sec 
rather than 1.02G ops/sec?

>Pure-mathematical real-parrot-ops should
>get pretty close to this, but most parrot-ops will have evil conditionals
>that thwart such performance.

Stamping out evil conditionals is one of the reasons for the variable 
vtables. We can't get rid of *all* the conditionals (it'd be a dull program 
that didn't make some decision... :) but if we're going to blow pipeline we 
might as well do it once and real big.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Michael Maraist

> void
> runloop(int**code) {
> static void* array[] = { &&op_end, &&op1, &&op2 };
>
> start:
>   goto *array[*code];
>
> op1:
>   foo; goto start;
>
> op2:
>   foo; goto start;
>
> op_end:
>   return;
> } // end code
>  w/ code=1000, loop_cnt = 5,000,000
> gcc -O3 speeds.c
> --
> switcher time delta=195
> caller time delta=268
> jumper time delta=185
> code compile time delta=0
> fast jumper time delta=154
>

I saw another post on the improvements provided by the simple
modification of:

goto *array[*code];

lbl1:
  # do stuff
  goto *array[*code];
lbl2:
  # do stuff
  goto *array[*code];
...
end:
  return;

And they were right.  I received MASSIVE performance boosts.

gcc -O3 w/ 1000 ops and 5,000,000 loops
jumper code = 31/32 seconds
fast jumper = 32/31 seconds

gcc -O3 w/ 50,000,000 ops and 10 loops
jumper code = 14 seconds
fast jumper = 13 seconds


This is a FULL 500% faster than the fastest above code.  Note also
that the "fast jumper" is only on par with the portable version.  (The
fast jumper modified the source-code to insert physical addresses
instead of using indexes to the array).

Two possible reasons for this.  One is cache-related issues, but I
highly doubt it because even the 50Meg (non-cachable) code was
insanely fast (thanks mostly due to predictable pre-caching).  The
reduction of a branch is most likely the reason.  Previously the
instruction until a branch was like 2 so the pipeline couldn't be
utilized.  Note that more complex instructions wouldn't help; they'd
just be more efficient.  What strikes me as odd though is that I used
a constant jump (goto start).  The Pentium Pro line is supposed to be
able to handle this efficiently.

The next thing to test is having 100 core op-codes.  Then it'll be
interesting to see how horrible adding the conditionals are.
Theoretically we don't need any conditionals.  "change_state" op-codes
can simply return NULL just like end does, thereby breaking out of the
loop automatically.

I never understood the reasoning behind while(*code) instead of just
while(code) (which is properly handled here).  Obviously there's value
to while( code > start && code < end ), but that's in the non-hell-bent loop.

Oh by the way.

500M ops in 15 seconds is 1.02 Billion parrot ops / second.  This isn't
too shabby on a mere 466MHZ machine.  That's a scant 2.18 parrot ops per
cycle.  Getting pretty close to the theoretical max that architecture can
perform.  It ran even faster with the non-cached memory (200Meg sys-ram),
than with short cachable bursts but that's mostly because the prefetcher
was always correct.  Note that this was only because there were only 3
assembly instructions per op-code, and all the op-codes were of the same
form (inc, get-next-op, jump).  Pure-mathematical real-parrot-ops should
get pretty close to this, but most parrot-ops will have evil conditionals
that thwart such performance.

-Michael




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Bryan C . Warnock

On Tuesday 09 October 2001 12:05 pm, Paolo Molaro wrote:
> > It improved the speed of non-optimized code, because you didn't jump to
> > the end of the switch simply to jump back to the loop conditional.  But
> > I didn't see any additional improvements with optimized code, because
> > the optimizers take care of that for you.  (Well, really, they put
> > another copy of the while condition at the bottom.)
>
> Yes, that was basically the same mistake I did at first when testing
> the goto label stuff, little or no improvement. But the 'right' way to do
> it is to use the goto label construct not only instead of switch(), but
> also instead of 'break;'. Oh, and the address array needs to be
> declared const and static, too.
>
>   const static void * goto_map[] = {&&OP1_LABEL, &&OP2_LABEL, ...};
>
>   ...
>   while (1) {
>   goto *goto_map [*ip];   // switch (*ip)
>   OP1_LABEL:
>   do_stuff ();
>   update_ip ();
>   goto *goto_map [*ip];   // break;
>   OP2_LABEL:
>   do_other_stuff ();
>   update_ip ();
>   goto *goto_map [*ip];   // break;
>   ...
>   }

Ooh, I like that.  I'll have to experiment. 

> > I've got three different opcode loops so far for Parrot. 
> > (Linux(x86)/gcc, Solaris(SPARC)/Forte, and Solaris(SPARC)/gcc).  I've
> > tried most every combination I can think of (am still working off the
> > last couple, as a matter of fact).  (Particularly ever since I received
> > the inexplicable slowdown adding a default case.)  Take nothing for
> > granted, and try it all. I've posted some of my more ridiculous
> > failures, and have posted what I have found to be my best numbers. 
> > Anyone is free to come up with a better, faster solution that meets the
> > requirements.
>
> Thanks for your efforts, hard numbers are always better than talk! :-)
> I was just trying to offer the experience I gathered working on similar
> issues, hoping it can be useful. And yes, I was suggesting to change
> a bit the requirements (not of the language, but of the current VM design)
> more than proposing an implementation of it.

That would be up to Dan, then.  I may pitch small tweaks to the design (like 
reserving a set of opcodes strictly for as core internal), but I'm not 
confident enough to do more than that.  

(It's like Grandpa's Five Iron.  (You replace the head, the shaft, and the 
grip, and still call it Grandpa's Five Iron.))

We are going to need to come up with a better strategy than a DO_OP macro, 
though.

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Dan Sugalski

At 12:13 PM 10/9/2001 -0400, Michael Maraist wrote:
>The key was &&lbl which returns the physical address of the a jumpable
>label.  This is highly gcc-specific, but given that we can use
>#ifdef's I don't see a problem with it. :)

No #ifdefs here. Instead the contents of the DO_OP macro should be 
dynamically generated at configure time. (Which is OK, since we need to eat 
all the .ops files at compile time anyway)

The results look really interesting, though. I'm tempted to see how it 
works with potentially 16-bit opcodes. (Which is where noops would come in 
handy as padding to guarantee the 32-bit inlined integers were properly 
aligned)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Michael Maraist

On Tue, 9 Oct 2001, Dan Sugalski wrote:

> At 01:20 PM 10/9/2001 +0200, Paolo Molaro wrote:
> >On 10/07/01 Bryan C. Warnock wrote:
> > > while (*pc) {
> > > switch (*pc) {
> > > }
> > > }
> If anyone wants an official ruling...
>
> DO_OP can contain any code you like as long as it:
>
> *) Some version ultimately compiles everywhere
> *) Allows opcodes above a certain point (probably 256 or 512) to be
> overridden at runtime
>
> Computed gotos, switches, function table dispatches, generated machine
> code, or some other form of coding madness are all OK. I don't care which
> way a particular platform goes as long as it doesn't constrain another
> platform's ability to go another way.

Hehe.  Ok, I delved into the particulars of gcc and found:

void
runloop(int**code) {
static void* array[] = { &&op_end, &&op1, &&op2 };

start:
  goto *array[*code];

op1:
  foo; goto start;

op2:
  foo; goto start;

op_end:
  return;
} // end code

The key was &&lbl which returns the physical address of the a jumpable
label.  This is highly gcc-specific, but given that we can use
#ifdef's I don't see a problem with it. :)

I even wrote an ugly version that avoided that array indirection.
What it did was "convert" the code-base to change the op-codes into
physical addresses.  This obviously has some negative ramifications,
but is definately doable.  By removing that extra indirection I got a
couple percent extra improvement (12%).  I can't imagine you'll get much
faster than that (except for an inlining jit or perl6->C compiler).
The only problem was that the &&lbl's didn't want to be non-local, so
I had to perform some evil magic.

The unpolished code is attached with a cheesy instruction set which
was just designed to prevent the optimizer from optimizing away my
op-codes.

Here are the benchmarks on a 466MHZ dual proc celeron:


w/ code=500,000,000 and loop_cnt = 10
(cache-bound)

gcc speeds.c
-
switcher time delta=45
caller time delta=47
jumper time delta=43
code compile time delta=4
fast jumper time delta=34

gcc -O speeds.c
-
switcher time delta=30
caller time delta=41
jumper time delta=28
code compile time delta=3
fast jumper time delta=26

gcc -O2 speeds.c
---
switcher time delta=30
caller time delta=41
jumper time delta=28
code compile time delta=3
fast jumper time delta=25

gcc -O3 speeds.c

switcher time delta=29
caller time delta=41
jumper time delta=28
code compile time delta=2
fast jumper time delta=26

==
==

 w/ code=1000, loop_cnt = 5,000,000
gcc -O3 speeds.c
--
switcher time delta=195
caller time delta=268
jumper time delta=185
code compile time delta=0
fast jumper time delta=154

# Note a marginal improvement from switch -> jump, yet an even bigger
improvement in jump to fast-jump.

It's unlikely to have a 50M (x4B+) code-base. But it's possible to
have an inner loop that has 500,000 iterations.

-Michael


#include
#include
#include
#include 
#include 
#include   
#include  
#include 


//#define SIZE 5000
#define SIZE 1000
int codep[SIZE];

void
dump_mappings()
{
  char strbuf[1024]; 
  int pid = getpid();
  sprintf( strbuf, "cat /proc/%i/maps", pid  );
  system( strbuf );

  sprintf( strbuf, "ls -l /proc/%i/fd" );
  system( strbuf );

  printf("press enter to continue");
  scanf("%c",strbuf);
 
}

int x,y,z;

void
switcher( int*code)
{
  while(1)
switch(*code) {
case 0: goto DONE;
  break;
case 1: x++;
  code += 1;
  break;
case 2: y++;
  code += 1;
  break;
case 3: z++;
  code += 1;
  break;
case 4: x = y + z;
  code += 1;
  break;
}
 DONE:
} // end switcher



int* end(int*code) { return 0; }
int* o1(int*code) { x++; return code+1; }
int* o2(int*code) { y++; return code+1; }
int* o3(int*code) { z++; return code+1; }
int* o4(int*code) { x = y + z; return code+1; }


typedef int* (*op_handler_t)(int*code);

op_handler_t op_codes[] = {
  end, o1, o2, o3, o4
};

void
caller( int*code)
{
  while(*code)
code = op_codes[ *code ](code);
} // end caller

void
gotoer( int*code)
{
  static int* array[] = { &&lend, &&lo1, &&lo2, &&lo3, &&lo4 };


 start:
  /*
  if ( !code ) {
puts("null code");
exit(-1);
  }
  if ( *code > 3 ) {
puts("Invalid idx");
exit(-1);
  }
  */
  goto *array[ *code ];

  lo1:
x++;
code += 1;
goto start;
  lo2:
y++;
code += 1;
goto start;
  lo3:
z++;
code += 1;
goto start;
  lo4:
z++;
code += 1;
goto start;

 lend:
return;
} // end gotoer

int* 
convert( int size, int*code, int*map )
{
  int* retval, *base;
  int idx =0;
  //dump_mappings();
  retval = base = (int*)malloc(size * sizeof(int));
  if ( !retval ) {
puts("out of memory");
exit(-1);
  }
  //dump_mappings();
  
  //printf("retval=%p, codep=%p\n", retval, codep );
  /*
  if (!retval) {
puts("couldn't allo

Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Paolo Molaro

On 10/09/01 Bryan C. Warnock wrote:
> > About the goto label feature discussed elsewhere, if the dispatch
> > loop is emitted at compile time, there is no compatibility problem
> > with non-gcc compilers, since we know what compiler we are going to use.
> > I got speedups in the 10-20% range with dispatch-intensive benchmarks in
> > mono. It can also be coded in a way similar to the switch code
> > with a couple of defines, if needed, so that the same code compiles
> > on both gcc and strict ANSI compilers.
> 
> I also tested (previously; I need to hit it again) replacing the loop, 
> switch, and breaks with a lot of gotos and labels.
> 
> LOOP:
> /* function look up, if need be */
> switch (*pc) {
> case (1) : { /* yada yada yada */; goto LOOP }
> ...
> }
> 
> It improved the speed of non-optimized code, because you didn't jump to the 
> end of the switch simply to jump back to the loop conditional.  But I didn't 
> see any additional improvements with optimized code, because the optimizers 
> take care of that for you.  (Well, really, they put another copy of the 
> while condition at the bottom.)  

Yes, that was basically the same mistake I did at first when testing
the goto label stuff, little or no improvement. But the 'right' way to do
it is to use the goto label construct not only instead of switch(), but
also instead of 'break;'. Oh, and the address array needs to be
declared const and static, too.

const static void * goto_map[] = {&&OP1_LABEL, &&OP2_LABEL, ...};

...
while (1) {
goto *goto_map [*ip];   // switch (*ip)
OP1_LABEL:
do_stuff ();
update_ip ();
goto *goto_map [*ip];   // break;
OP2_LABEL:
do_other_stuff ();
update_ip ();
goto *goto_map [*ip];   // break;
...
}

> > The problem here is to make sure we really need the opcode swap
> > functionality, it's really something that is going to kill
> > dispatch performance.
> > If a module wants to change the meaning of, eg the + operator,
> > it can simply request the compiler to insert a call to a
> > subroutine, instead of changing the meaning assigned to the
> > VM opcode. The compiler is free to inline the sub, of course,
> > just don't cripple the normal case with unnecessary overhead
> > and let the special case pay the price of flexibility.
> > Of course, if the special case is not so special, a _new_
> > opcode can be introduced, but there is really no reason to
> > change the meaning of an opcode on the fly, IMHO.
> > Comment, or flame, away.
> 
> But how are you going to introduce the new opcode?  Recompile Perl? 

Nope, this is done at the design and early beta stage: we decide which
opcodes make sense and which not and we may add specialized opcodes if
it makes sense to do so. Once the opcode set is defined, it can't be
changed until the next release.

> Unacceptable.  We understand that from a classic language perspective, we're 
> slow and lumbering.  We're Perl.  We need that flexibilty.  We're trying to 
> make that flexibility as fast as possible.

Completely agree. My point is that I don't see a real reason to be
flexible at the opcode level (changing the meaning of the opcodes)
when you can have the same outcome with a cleaner design (set a
different method in a class' vtable).
Many things were done as opcodes in perl5 because calling subroutines
is slow there, IIRC. Many of that features can simply be methods
in some class in perl6: it's cleaner and it's flexible and it doesn't
require to change the meaning of the opcodes.

> I've got three different opcode loops so far for Parrot.  (Linux(x86)/gcc, 
> Solaris(SPARC)/Forte, and Solaris(SPARC)/gcc).  I've tried most every 
> combination I can think of (am still working off the last couple, as a 
> matter of fact).  (Particularly ever since I received the inexplicable 
> slowdown adding a default case.)  Take nothing for granted, and try it all.  
> I've posted some of my more ridiculous failures, and have posted what I have 
> found to be my best numbers.  Anyone is free to come up with a better, 
> faster solution that meets the requirements.

Thanks for your efforts, hard numbers are always better than talk! :-)
I was just trying to offer the experience I gathered working on similar 
issues, hoping it can be useful. And yes, I was suggesting to change
a bit the requirements (not of the language, but of the current VM design)
more than proposing an implementation of it.

lupus

-- 
-
[EMAIL PROTECTED] debian/rules
[EMAIL PROTECTED] Monkeys do it better



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Dan Sugalski

At 10:51 PM 10/7/2001 -0400, Michael Fischer wrote:
>Ok, this is a big one. Over 500 line diff, plus a new module
>for parrot/Parrot.

Thanks a lot for this.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Dan Sugalski

At 01:20 PM 10/9/2001 +0200, Paolo Molaro wrote:
>On 10/07/01 Bryan C. Warnock wrote:
> > while (*pc) {
> > switch (*pc) {
> > }
> > }
>
>With the early mono interpreter I observed a 10% slowdown when I
>checked that the instruction pointer was within the method's code:
>that is not a check you need on every opcode dispatch, but only with
>branches, so it makes sense to have a simple while (1) loop.

Yeah, I really want to toss that as well for the normal case. Checking the 
validity of the instruction pointer's a good thing for the safe(ish) 
version of the interpreter loop, where we're explicitly asking for assurances.

>About the goto label feature discussed elsewhere, if the dispatch
>loop is emitted at compile time, there is no compatibility problem
>with non-gcc compilers, since we know what compiler we are going to use.
>I got speedups in the 10-20% range with dispatch-intensive benchmarks in
>mono. It can also be coded in a way similar to the switch code
>with a couple of defines, if needed, so that the same code compiles
>on both gcc and strict ANSI compilers.

If anyone wants an official ruling...

DO_OP can contain any code you like as long as it:

*) Some version ultimately compiles everywhere
*) Allows opcodes above a certain point (probably 256 or 512) to be 
overridden at runtime

Computed gotos, switches, function table dispatches, generated machine 
code, or some other form of coding madness are all OK. I don't care which 
way a particular platform goes as long as it doesn't constrain another 
platform's ability to go another way.

> > seems simpler, but introduces some potential page (or at least i-cache(?))
> > thrashing, as you've got to do a significant jump just in order to jump
> > again.  The opcode comparison, followed by a small jump, behaves 
> much >nicer.

FWIW (and I know this is out of order) the distance jumped really doesn't 
make any difference in the time it takes to jump. The real indicator if the 
time is what cache the destination is in. A 20 word jump will take longer 
than a 4M jump if the destination for the first is in main memory and the 
second in L1 cache. (L1 cache generally being loaded in 8, 16, or 32 byte 
chunks and L2 cache in 512 or 1024 byte chunks, but that's terribly 
processor-dependent)

> > I've found [2] that the fastest solution (on the platforms I've tested) 
> are
> > within the family:
> >
> > while (*pc) {
> > if (*pc > CORE_OPCODE_NUMBER) {
> > pc = func_table[*pc]();
> > } else {
> > switch (*pc) {
> > }
> > }
>... but adds a comparison even for opcodes that don't need it.

But it removes some uncertainty that the C compiler optimizer might be able 
to use to do better optimizations of the following switch. (Whether it 
*will* or not depends on the smarts built into the compiler) I can fully 
see this as a non-intuitive chunk of code that gets a performance boost.

I'm really thinking that we're going to end up with benchmarked chunks of 
code keyed by OS and processor here, chosen by the configure process. I'm 
half-tempted to weld the benchmark runs into configure itself, but I'm 
leery of it choosing the wrong thing because some cron job fired off in the 
middle of its check.

>The problem here is to make sure we really need the opcode swap
>functionality, it's really something that is going to kill
>dispatch performance.

Dunno about kill, but it will have an impact. I'm fully aware of that, but 
lots of things in dynamic languages in general have performance impacts. 
Sometimes (and whether this is one of those times or not is up in the air) 
you just have to pay it.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Bryan C . Warnock

On Tuesday 09 October 2001 07:20 am, Paolo Molaro wrote:
> On 10/07/01 Bryan C. Warnock wrote:
> > while (*pc) {
> > switch (*pc) {
> > }
> > }
>

Should have been "while (pc)".  Oops.

> With the early mono interpreter I observed a 10% slowdown when I
> checked that the instruction pointer was within the method's code:
> that is not a check you need on every opcode dispatch, but only with
> branches, so it makes sense to have a simple while (1) loop.

Yes, it's amazing how fast "hell-bent" is.  ;-)  In my pre-Parrot testing, I 
did simply do a while (1).  Since, I've used whatever was already coded.  
(Mostly.)  However, I'd need a ruling from Dan on what is acceptable
for the opcode dispatch loop.  I've simply used the existing code to provide 
a baseline behavior. 

>
> About the goto label feature discussed elsewhere, if the dispatch
> loop is emitted at compile time, there is no compatibility problem
> with non-gcc compilers, since we know what compiler we are going to use.
> I got speedups in the 10-20% range with dispatch-intensive benchmarks in
> mono. It can also be coded in a way similar to the switch code
> with a couple of defines, if needed, so that the same code compiles
> on both gcc and strict ANSI compilers.

I also tested (previously; I need to hit it again) replacing the loop, 
switch, and breaks with a lot of gotos and labels.

LOOP:
/* function look up, if need be */
switch (*pc) {
case (1) : { /* yada yada yada */; goto LOOP }
...
}

It improved the speed of non-optimized code, because you didn't jump to the 
end of the switch simply to jump back to the loop conditional.  But I didn't 
see any additional improvements with optimized code, because the optimizers 
take care of that for you.  (Well, really, they put another copy of the 
while condition at the bottom.)  

> > seems simpler, but introduces some potential page (or at least
> > i-cache(?)) thrashing, as you've got to do a significant jump just in
> > order to jump again.  The opcode comparison, followed by a small jump,
> > behaves much nicer.
>
> ... but adds a comparison even for opcodes that don't need it.
> As with the check for the program counter, it's a check that
> not all the opcodes need and as such should be left out of the
> fast path. This means that the first 200 and something opcodes
> are the most common ones _and_ the ones that need to be fast:
> there is no point in avoiding a jump for calls to exit(),
> read() etc (assuming those need opcodes at all).

Well, you've got to represent them somehow.  I haven't come up with any 
clever (or non-clever, for that matter) of doing that.  We toyed with 
earlier of doing contextual switches - of changing the opcode loop itself as 
the context of the opcode stream changes.  Part of that's been done already, 
with the trace and bounds checking.  Currently, though, the opcode loops are 
too heavy to make that efficient.

>
> The problem here is to make sure we really need the opcode swap
> functionality, it's really something that is going to kill
> dispatch performance.
> If a module wants to change the meaning of, eg the + operator,
> it can simply request the compiler to insert a call to a
> subroutine, instead of changing the meaning assigned to the
> VM opcode. The compiler is free to inline the sub, of course,
> just don't cripple the normal case with unnecessary overhead
> and let the special case pay the price of flexibility.
> Of course, if the special case is not so special, a _new_
> opcode can be introduced, but there is really no reason to
> change the meaning of an opcode on the fly, IMHO.
> Comment, or flame, away.

But how are you going to introduce the new opcode?  Recompile Perl?  
Unacceptable.  We understand that from a classic language perspective, we're 
slow and lumbering.  We're Perl.  We need that flexibilty.  We're trying to 
make that flexibility as fast as possible.

I've got three different opcode loops so far for Parrot.  (Linux(x86)/gcc, 
Solaris(SPARC)/Forte, and Solaris(SPARC)/gcc).  I've tried most every 
combination I can think of (am still working off the last couple, as a 
matter of fact).  (Particularly ever since I received the inexplicable 
slowdown adding a default case.)  Take nothing for granted, and try it all.  
I've posted some of my more ridiculous failures, and have posted what I have 
found to be my best numbers.  Anyone is free to come up with a better, 
faster solution that meets the requirements.

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Paolo Molaro

On 10/09/01 Benjamin Stuhl wrote:
> Unfortunately, compiler tricks only work at compile time.
> They're great for static languages like C++ or C#, but Perl
> supports doing
> 
> %CORE::GLOBAL::{'&print'} = \&myprint;
> 
> at _runtime_. This is much to late to be going back and
> patching up any occurences of print_p in the opstream, so
> we need a level of indirection on every overridable opcode.
> (Note that the _overloadable_ ones like the math routines
> don't need that level of indirection - they get it by
> vectoring through the PMC vtables).

And the same is supposed to happen when you override print,
since print is just a method in the Stream 'class' or
whatever it's going to be called.
If there is a 'print' opcode the implementation should do
something like:

case PRINT_OP:
interp->current_stdout->vtable->print (...);

and that will do the right thing anyway.

lupus

-- 
-
[EMAIL PROTECTED] debian/rules
[EMAIL PROTECTED] Monkeys do it better



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Benjamin Stuhl

--- Paolo Molaro <[EMAIL PROTECTED]> wrote:
[snip, snip]
> The problem here is to make sure we really need the
> opcode swap
> functionality, it's really something that is going to
> kill
> dispatch performance.
> If a module wants to change the meaning of, eg the +
> operator,
> it can simply request the compiler to insert a call to a
> subroutine, instead of changing the meaning assigned to
> the
> VM opcode. The compiler is free to inline the sub, of
> course,
> just don't cripple the normal case with unnecessary
> overhead
> and let the special case pay the price of flexibility.
> Of course, if the special case is not so special, a _new_
> opcode can be introduced, but there is really no reason
> to
> change the meaning of an opcode on the fly, IMHO.
> Comment, or flame, away.

Unfortunately, compiler tricks only work at compile time.
They're great for static languages like C++ or C#, but Perl
supports doing

%CORE::GLOBAL::{'&print'} = \&myprint;

at _runtime_. This is much to late to be going back and
patching up any occurences of print_p in the opstream, so
we need a level of indirection on every overridable opcode.
(Note that the _overloadable_ ones like the math routines
don't need that level of indirection - they get it by
vectoring through the PMC vtables).

-- BKS

__
Do You Yahoo!?
NEW from Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-09 Thread Paolo Molaro

On 10/07/01 Bryan C. Warnock wrote:
> while (*pc) {
> switch (*pc) {
> }
> }

With the early mono interpreter I observed a 10% slowdown when I
checked that the instruction pointer was within the method's code:
that is not a check you need on every opcode dispatch, but only with
branches, so it makes sense to have a simple while (1) loop.

About the goto label feature discussed elsewhere, if the dispatch
loop is emitted at compile time, there is no compatibility problem
with non-gcc compilers, since we know what compiler we are going to use.
I got speedups in the 10-20% range with dispatch-intensive benchmarks in
mono. It can also be coded in a way similar to the switch code
with a couple of defines, if needed, so that the same code compiles
on both gcc and strict ANSI compilers.

> I don't see (on simple inspection) a default case, which implies that all 
> functions would be in the switch.  There's two problems with that.  First, 
> you can't then swap out (or add) opcode functions, which compilation units 
> need to do.  They're all fixed, unless you handle opcode differentiation 
> within each case.  (And there some other problems associated with that, too, 
> but they're secondary.)  Second, the switch will undoubtedly grow too large 
> to be efficient.  A full switch with as few as 512 branches already shows 
> signs of performance degradation. [1]  Thirdly, any function calls that you 
> then *do* have to make, come at the expense of the switching branch, on top 
> of normal function call overhead.
> 
> I've found [2] that the fastest solution (on the platforms I've tested) are 
> within the family:
> 
> while (*pc) {
> if (*pc > CORE_OPCODE_NUMBER) {
> pc = func_table[*pc]();
> } else {
> switch (*pc) {
> }
> }
> 
> That keeps the switch branching small.  Do this:
> 
> while (*pc) {
> switch (*pc) {
> case : ...
> default: pc = func_table[*pc]();
> }
> }
> 
> seems simpler, but introduces some potential page (or at least i-cache(?)) 
> thrashing, as you've got to do a significant jump just in order to jump 
> again.  The opcode comparison, followed by a small jump, behaves much nicer.

... but adds a comparison even for opcodes that don't need it.
As with the check for the program counter, it's a check that
not all the opcodes need and as such should be left out of the
fast path. This means that the first 200 and something opcodes
are the most common ones _and_ the ones that need to be fast:
there is no point in avoiding a jump for calls to exit(),
read() etc (assuming those need opcodes at all).

The problem here is to make sure we really need the opcode swap
functionality, it's really something that is going to kill
dispatch performance.
If a module wants to change the meaning of, eg the + operator,
it can simply request the compiler to insert a call to a
subroutine, instead of changing the meaning assigned to the
VM opcode. The compiler is free to inline the sub, of course,
just don't cripple the normal case with unnecessary overhead
and let the special case pay the price of flexibility.
Of course, if the special case is not so special, a _new_
opcode can be introduced, but there is really no reason to
change the meaning of an opcode on the fly, IMHO.
Comment, or flame, away.

lupus

-- 
-
[EMAIL PROTECTED] debian/rules
[EMAIL PROTECTED] Monkeys do it better



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-07 Thread Bryan C . Warnock

Oops, on-list again.

On Monday 08 October 2001 12:12 am, Michael Fischer wrote:
>
> It is generated as a pass over basic_opcodes.ops, just as the function
> definitions in basic_opcodes.c are done. I figured the philosophy was
> "Don't hack the output, hack the template" if you want to change the
> generated file. What swapping of opcodes do you imagine? I wouldn't have
> thought that user code gets to muck with the asm offerings available.


Every compilation unit (like a module, for instance) carries some or all of 
its own opcode table with it.  These are switched out as the scope of the 
compilation unit changes.  External code can and will define their own 
opcode functions, much in the same way that basic_opcodes.ops defines the 
default ones for the Parrot interpreter.  However, to remove the necessity 
of opcode numbering coordination, the opcodes are arbitrarily numbered per 
compilation unit.  So opcode 220 in Foo.pm may be a "leading 0" bit test, 
while opcode 220 in Bar.pm may be pipe creation.  As the interpreter loads 
new code at runtime, it needs to be able to load and dispatch the new opcode.
That's why within the current DO_OP loop, you dereference the current 
function table list from the interpreter structure.

>
> > Second, the switch will undoubtedly grow too large
> > to be efficient.  A full switch with as few as 512 branches already
> > shows signs of performance degradation. [1]  Thirdly, any function calls
> > that you then *do* have to make, come at the expense of the switching
> > branch, on top of normal function call overhead.
>
> How many asm opcodes do you imagine we'll have? I'm hardly expert here,
> but I have a hard time believing we're going to have anything like 512.
> Much of the remaining flexibility would be done in the vtable code.
> Unless of course, _every_ new vtable function has to be re-written as
> a single opcode... oh dear ... but now I'm getting confused.


No.  Each PMC opcode will in turn call a vtable op, which invariably will 
call one of our core ops.  How many will there be?  Perl 5 currently has 
over 300, and that doesn't include *anything* regex related (besides m// and 
s///), which Larry has already deemed will be handled as ops.  I estimate 
that the core interpreter will provide over 800 individual opcodes.  
(Remember, everything is an opcode, from variable creation to scope 
deletion.)

We can mitigate that by segmenting the opcodes into families.  For instance, 
let's say the most efficent dispatch is via a switch no longer than 256 
branches.  Since regular expression ops are most likely going to occur 
mostly in conjunction with other regular expression ops, then we can switch 
to an alternate dispatch loop where the switch is populated with regex 
opcode handlers. (Assuming that they're invariant.  See below.).  Less 
frequently called opcodes, or opcodes that are already timely, can be 
relegated forever to function table lookup.

>
> As I understand it, all the function calls get inlined into the loop,
> so what ones are going to be called inside it?


Opcodes that are invariant, frequent, and quick.  

Invariant - Two integers add together only one way.  I/O input may be done 
any number of ways.  

Frequent - basic math ops occur often.  Exit does not. 

Quick - bit-wise manipulaton of integers should be quick..  Sleep doesn't 
need to be.  

The first of each are good candidates for inlining into the switch.  The 
second are not.

>
> Sorry if I'm being dim and missing issues, I just wanted to get
> this thing done, and I followed instructions. Perhaps needs/designs have
> changed since instructions were given.

Perhaps they have.  The above is the last design direction that I'm aware of.

>
> Um, the latter's include file looks an awful lot like the switch
> my patch generates. How does it work better? Not trying to be
> antagonistic, I just don't see how it is any different/faster.

The switch itself, yes.  That's why I didn't point you to that.  ;-)

You solution should be as fast as the last set that I benchmarked.  (Faster 
even, because the last tweak I made isn't in there, and I am doing the extra 
comparison.)  But you will need to eventually address adding and overriding 
opcodes, which I don't believe your approach does well.  (Which does make 
the difference.)

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-07 Thread Michael Fischer

On Oct 07, "Bryan C. Warnock" <[EMAIL PROTECTED]> took up a keyboard and banged out
> On Sunday 07 October 2001 10:51 pm, Michael Fischer wrote:
> >
> > Questions, comments, criticisms?
> 
> It looks like you're implementing this as a full, solitary switch.
> 
> while (*pc) {
> switch (*pc) {
> }
> }

Right, as per Simon and Dan's original suggestions to me.

> 
> I don't see (on simple inspection) a default case, which implies that all 
> functions would be in the switch.  There's two problems with that.  First, 
> you can't then swap out (or add) opcode functions, which compilation units 
> need to do.  They're all fixed, unless you handle opcode differentiation 
> within each case.  (And there some other problems associated with that, too, 
> but they're secondary.) 

Um, yeah, I left out the issue of default, as I had no idea what to do
with it ( fprintf(stderr, "Hey, non-opcode!"); exit( EXIT_FAILURE ); )???
Anyway, suggestions from the list welcome on that point. 

It is generated as a pass over basic_opcodes.ops, just as the function
definitions in basic_opcodes.c are done. I figured the philosophy was
"Don't hack the output, hack the template" if you want to change the
generated file. What swapping of opcodes do you imagine? I wouldn't have
thought that user code gets to muck with the asm offerings available.

> Second, the switch will undoubtedly grow too large 
> to be efficient.  A full switch with as few as 512 branches already shows 
> signs of performance degradation. [1]  Thirdly, any function calls that you 
> then *do* have to make, come at the expense of the switching branch, on top 
> of normal function call overhead.

How many asm opcodes do you imagine we'll have? I'm hardly expert here,
but I have a hard time believing we're going to have anything like 512.
Much of the remaining flexibility would be done in the vtable code.
Unless of course, _every_ new vtable function has to be re-written as
a single opcode... oh dear ... but now I'm getting confused. 

As I understand it, all the function calls get inlined into the loop,
so what ones are going to be called inside it?

Sorry if I'm being dim and missing issues, I just wanted to get 
this thing done, and I followed instructions. Perhaps needs/designs have 
changed since instructions were given.

> 
> [1] Pre-Parrot testing at 
> http://members.home.net/bcwarno/Perl6/spool/opcode_test_results.txt
> 
> [2] The code for my last set of benchmarks at 
> http://members.home.net/bcwarno/Perl6/spool/interpreter.c

Um, the latter's include file looks an awful lot like the switch
my patch generates. How does it work better? Not trying to be 
antagonistic, I just don't see how it is any different/faster.

Clues welcomed.

Cheers.

Michael
-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 



Re: [PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-07 Thread Bryan C . Warnock

On Sunday 07 October 2001 10:51 pm, Michael Fischer wrote:
>
> Questions, comments, criticisms?

It looks like you're implementing this as a full, solitary switch.

while (*pc) {
switch (*pc) {
}
}

I don't see (on simple inspection) a default case, which implies that all 
functions would be in the switch.  There's two problems with that.  First, 
you can't then swap out (or add) opcode functions, which compilation units 
need to do.  They're all fixed, unless you handle opcode differentiation 
within each case.  (And there some other problems associated with that, too, 
but they're secondary.)  Second, the switch will undoubtedly grow too large 
to be efficient.  A full switch with as few as 512 branches already shows 
signs of performance degradation. [1]  Thirdly, any function calls that you 
then *do* have to make, come at the expense of the switching branch, on top 
of normal function call overhead.

I've found [2] that the fastest solution (on the platforms I've tested) are 
within the family:

while (*pc) {
if (*pc > CORE_OPCODE_NUMBER) {
pc = func_table[*pc]();
} else {
switch (*pc) {
}
}

That keeps the switch branching small.  Do this:

while (*pc) {
switch (*pc) {
case : ...
default: pc = func_table[*pc]();
}
}

seems simpler, but introduces some potential page (or at least i-cache(?)) 
thrashing, as you've got to do a significant jump just in order to jump 
again.  The opcode comparison, followed by a small jump, behaves much nicer.

[1] Pre-Parrot testing at 
http://members.home.net/bcwarno/Perl6/spool/opcode_test_results.txt

[2] The code for my last set of benchmarks at 
http://members.home.net/bcwarno/Perl6/spool/interpreter.c

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



[PATCH] Big patch to have DO_OP as optional switch() statment

2001-10-07 Thread Michael Fischer

Ok, this is a big one. Over 500 line diff, plus a new module
for parrot/Parrot.

This patch adds a prompt to Configure.pl to allow the choice
of DO_OP() as a function dereference (current behavior) or
as a C switch() statement.

Current up until Gregor applied 2 of Bryan's patches this evening.
I hope they don't mess it up, as I haven't the energy tonight
to do more. :-(

This patchset gave the following, when run with options for
function dereference ( the current mode ), and the switch()

[ SuSE linux x86 750Mhz Athlon, 256MB RAM ]
Normal:

Iterations: 1
Start time: 1002505053
End time: 1002505082
Count: 1
Elapsed time:29
Estimated ops:3
Estimated ops (numerically):3.00
Elapsed time:29
Elapsed time:29.00
Ops/sec:10344827.586207

Switched:

Iterations: 1
Start time: 1002507616
End time: 1002507639
Count: 1
Elapsed time:23
Estimated ops:3
Estimated ops (numerically):3.00
Elapsed time:23
Elapsed time:23.00
Ops/sec:13043478.260870


and courtesy of Mr. Schwern (debian on PowerPC, I believe)

Normal:

Iterations: 1
Start time: 1002507775
End time: 1002507838
Count: 1
Elapsed time:63
Estimated ops:3
Estimated ops (numerically):3.00
Elapsed time:63
Elapsed time:63.00
Ops/sec:4761904.761905


Switched:

Iterations: 1
Start time: 1002508550
End time: 1002508596
Count: 1
Elapsed time:46
Estimated ops:3
Estimated ops (numerically):3.00
Elapsed time:46
Elapsed time:46.00
Ops/sec:6521739.130435

--

Questions, comments, criticisms?

Cheers.

-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 


diff -ru parrot/Configure.pl parrot-switched/Configure.pl
--- parrot/Configure.pl Thu Oct  4 16:19:38 2001
+++ parrot-switched/Configure.plSun Oct  7 21:59:27 2001
@@ -85,6 +85,7 @@
perl => $^X,
debugging =>$opt_debugging,
rm_f => 'rm -f',
+   do_op_t =>  'switch',
 );
 
 #copy the things from --define foo=bar
@@ -108,6 +109,7 @@
 prompt("How big would you like integers to be?", 'iv');
 prompt("And your floats?", 'nv');
 prompt("What is your native opcode type?", 'opcode_t');
+prompt("Opcode dispatch by switch or function ('switch' or 'func')", 'do_op_t');
 
 unless( $c{debugging} ) {
$c{ld_debug} = ' ';
diff -ru parrot/MANIFEST parrot-switched/MANIFEST
--- parrot/MANIFEST Sat Oct  6 08:41:57 2001
+++ parrot-switched/MANIFESTSun Oct  7 21:59:35 2001
@@ -14,6 +14,7 @@
 Parrot/String.pm
 Parrot/Test.pm
 Parrot/Vtable.pm
+Parrot/PPP.pm
 Test/More.pm
 Test/Simple.pm
 Test/Utils.pm
diff -ru parrot/Makefile.in parrot-switched/Makefile.in
--- parrot/Makefile.in  Sun Oct  7 10:41:18 2001
+++ parrot-switched/Makefile.in Sun Oct  7 21:59:35 2001
@@ -18,6 +18,7 @@
 PERL = ${perl}
 TEST_PROG = test_prog${exe}
 PDUMP = pdump${exe}
+DO_OP_T = ${do_op_t}
 
 .c$(O):
$(CC) $(CFLAGS) -o $@ -c $<
@@ -31,7 +32,7 @@
 
 $(TEST_PROG): test_main$(O) $(O_FILES) interp_guts$(O) op_info$(O)
$(CC) $(CFLAGS) -o $(TEST_PROG) $(O_FILES) interp_guts$(O) op_info$(O) 
test_main$(O) $(C_LIBS)
-   
+
 $(PDUMP): pdump$(O) packfile$(O) memory$(O) global_setup$(O) string$(O) strnative$(O)
$(CC) $(CFLAGS) -o $(PDUMP) pdump$(O) packfile$(O) memory$(O) global_setup$(O) 
string$(O) strnative$(O) $(C_LIBS)
 
@@ -44,7 +45,7 @@
 strnative$(O): $(H_FILES)
 
 $(INC)/interp_guts.h interp_guts.c $(INC)/op_info.h op_info.c: opcode_table 
build_interp_starter.pl
-   $(PERL) build_interp_starter.pl
+   $(PERL) build_interp_starter.pl -t $(DO_OP_T)
 
 interpreter$(O): interpreter.c $(H_FILES) $(INC)/interp_guts.h
 
@@ -59,7 +60,7 @@
 basic_opcodes$(O): $(H_FILES) basic_opcodes.c
 
 basic_opcodes.c: basic_opcodes.ops process_opfunc.pl $(INC)/interp_guts.h
-   $(PERL) process_opfunc.pl basic_opcodes.ops
+   $(PERL) process_opfunc.pl > basic_opcodes.c
 
 $(INC)/op.h: opcode_table make_op_header.pl
$(PERL) make_op_header.pl opcode_table > $(INC)/op.h
Only in parrot-switched/Parrot: PPP.pm
diff -ru parrot/basic_opcodes.ops parrot-switched/basic_opcodes.ops
--- parrot/basic_opcodes.opsSun Oct  7 11:27:42 2001
+++ parrot-switched/basic_opcodes.ops   Sun Oct  7 21:59:35 2001
@@ -1,11 +1,3 @@
-/* basic_opcodes.c
- *
- * Just some basic opcodes
- *
- */
-
-#include "parrot/parrot.h"
-#include 
 
 /* SET Ix, CONSTANT */
 AUTO_OP set_i_ic {
diff -ru parrot/build_interp_starter.pl parrot-switched/build_interp_starter.pl
--- parrot/build_interp_starter.pl  Sun Oct  7 20:46:15 2001
+++ parrot-switched/build_interp_starter.pl Sun Oct  7 21:59:35 2001
@@ -7,6 +7,16 @@
 
 use strict;
 use Parrot::Opcode;
+use Parrot::PPP;
+
+use Getopt::Std;
+
+use vars qw($opt_t);
+getopts('t:');
+
+die "You