On 09/05/01 Dan Sugalski wrote:
> >It's easier to generate code for a stack machine
> 
> So? Take a look at all the stack-based interpreters. I can name a bunch, 
> including perl. They're all slow. Some slower than others, and perl tends 
> to be the fastest of the bunch, but they're all slow.

Have a look at the shootout benchmarks. Yes, we all know that
benchmarks lie, but...
The original mono interpreter (that didn't implement all the semantics
required by IL code that slow down interpretation) ran about 4 times
faster than perl/python on benchmarks dominated by branches, function calls,
integer ops or fp ops.

> >That said, I haven't seen any evidence a register based machine is going to
> >be (significantly?) faster than a stack based one.
> >I'm genuinely interested in finding data about that.
> 
> At the moment a simple mix of ops takes around 26 cycles per opcode on an 
> Alpha EV6. (This is an even mix of branch, test, and integer addition 
> opcodes)  That's with everything sticking in cache, barring task switches. 
> It runs around 110 cycles/op on the reasonably antique machine I have at 
> home. (A 300MHz Celeron (the original, with no cache))

Subliminal message: post the code... :-)

> You're also putting far too much emphasis on registers in general. Most of 
> the work the interpreter will be doing will be elsewhere, either in the 
> opcode functions or in the variable vtable functions. The registers are 

That is true when executing high-level opcodes, and there the choice between
a register and a stack machine makes no difference. It's not true for
the low-level opcodes that parrot is supposed to handle, according to the
overview posted by Simon.

> It'll be faster than perl for low-level stuff because we'll have the option 
> to not carry the overhead of full variables if we don't need it. It should 
> be faster than perl 5 with variables too, which will put us at the top of 
> the performance heap, possibly duking it out with Java. (Though I think 
> perl 5's faster than java now, but it's tough to get a good equivalence 
> there)

Rewriting perl will leave behind all the cruft that accumulated over the years,
so it should not be difficult for parrot to run faster ;-)
Java is currently way faster than perl in many tasks: it will be difficult
to beat it starting from a dynamic language like perl. That is the price we
all pay for having a useful language like perl.
Most of us are here because we wouldn't want to program in a strongly typed
language, more than for perl's speed. Note also that while java is faster than
perl most of the time, this advantage is completely wasted when you realize
you need 20 megs of RAM to run hello world :-)

> >The only difference in the execution engine is that you need to update
> >the stack pointer. The problem is when you need to generate code
> >for the virtual machine.
> 
> Codegen for register architectures is a long-solved problem. We can reach 
> back 30 or more years for it if we want. (We don't, the old stuff has been 

... when starting from a suitable intermediate representation (i.e., not
machine code for another register machine).

>         push 0
>         pushaddr i
>         store
> foo:  | push i
>       | push 1000
>       | branchgt end
>       | push 7
>       | push i
>       | add
>       | pushaddr i
>       | store
>       | jump foo
> end:
> 
> 
> with the ops executed in the loop marked with pipes. The corresponding 
> parrot code would be:
> 
>        getaddr P0, i
>        store   P0, 0
>        store   I0, 1000
> foo: | branchgt end, P0, I0
>      | add P0, P0, 7
>      | jump foo
[...]
> So, best case (combined store, branch with constant embedded) the stack 
> based scheme has 7 opcodes in the loop, while parrot has 3. With the likely 
> case (what you see above) it's 9.

Well, it would be up to us to design the bytecode, so I'd say it's likely 7.
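
To make the counting concrete, here is a toy sketch in Python (not parrot or
mono code) that runs the loop above on a minimal stack machine and a minimal
register machine and counts dispatched opcodes. The fused ops `storeto` and
`branchgtc` are hypothetical names for the "combined store" and "branch with
constant embedded" best case, and `getaddr` is a no-op because the toy has no
memory model. It counts dispatches only and says nothing about cycles per op.

```python
def run_stack(program, labels):
    """Toy stack machine. Returns (variables, number of ops dispatched)."""
    stack, var, pc, count = [], {}, 0, 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        count += 1
        if op == "push":        # a string arg names a variable, else a constant
            stack.append(var[arg] if isinstance(arg, str) else arg)
        elif op == "pushaddr":  # the variable name stands in for its address
            stack.append(arg)
        elif op == "store":     # pops the address, then the value
            name = stack.pop()
            var[name] = stack.pop()
        elif op == "storeto":   # hypothetical fused pushaddr + store
            var[arg] = stack.pop()
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "branchgt":  # pops b, then a; branches if a > b
            b, a = stack.pop(), stack.pop()
            if a > b:
                pc = labels[arg]
        elif op == "branchgtc": # hypothetical fused push const + branchgt
            limit, target = arg
            if stack.pop() > limit:
                pc = labels[target]
        elif op == "jump":
            pc = labels[arg]
    return var, count


def run_register(program, labels):
    """Toy register machine. Returns (registers, number of ops dispatched)."""
    reg, pc, count = {}, 0, 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        count += 1
        if op == "getaddr":     # no-op here: P0 is treated as a plain register
            pass
        elif op == "store":
            reg[args[0]] = args[1]
        elif op == "add":
            dst, src, const = args
            reg[dst] = reg[src] + const
        elif op == "branchgt":
            target, a, b = args
            if reg[a] > reg[b]:
                pc = labels[target]
        elif op == "jump":
            pc = labels[args[0]]
    return reg, count


# The 9-op loop as quoted above.
STACK_PLAIN = [
    ("push", 0), ("pushaddr", "i"), ("store", None),
    ("push", "i"), ("push", 1000), ("branchgt", "end"),      # foo:
    ("push", 7), ("push", "i"), ("add", None),
    ("pushaddr", "i"), ("store", None), ("jump", "foo"),
]
STACK_PLAIN_LABELS = {"foo": 3, "end": 12}

# The 7-op best case, using the two hypothetical fused ops.
STACK_FUSED = [
    ("push", 0), ("storeto", "i"),
    ("push", "i"), ("branchgtc", (1000, "end")),             # foo:
    ("push", 7), ("push", "i"), ("add", None),
    ("storeto", "i"), ("jump", "foo"),
]
STACK_FUSED_LABELS = {"foo": 2, "end": 9}

# The 3-op register loop as quoted above.
REGISTER = [
    ("getaddr", "P0", "i"), ("store", "P0", 0), ("store", "I0", 1000),
    ("branchgt", "end", "P0", "I0"),                         # foo:
    ("add", "P0", "P0", 7), ("jump", "foo"),
]
REGISTER_LABELS = {"foo": 3, "end": 6}

if __name__ == "__main__":
    print("stack, plain:", run_stack(STACK_PLAIN, STACK_PLAIN_LABELS)[1])
    print("stack, fused:", run_stack(STACK_FUSED, STACK_FUSED_LABELS)[1])
    print("register:    ", run_register(REGISTER, REGISTER_LABELS)[1])
```

The loop body runs 143 times in each version, so the totals differ by the
per-iteration op count (9, 7 or 3) plus a few setup and exit ops. How that
translates into time depends entirely on the cost of each dispatch and of the
operand decoding, which this sketch does not model.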

> Do you really think the stack-based way will be faster?

The speed of the above loop depends a lot on the actual implementation
(the need to do a function call in the current parrot code would blow
away any advantage gained by skipping stack updates, for example).
Also, this example doesn't take into account the calling convention
for functions: where do you put the arguments for a call? Will
you need to push/copy them?

As I said in another mail, I think the stack-based approach will not
necessarily be faster, but it will allow more optimizations down the road.
It may well be 20 % slower in some cases when interpreted, but if it allows
me to easily JIT it and get 400 % faster, it's a non-issue.

lupus

-- 
-----------------------------------------------------------------
[EMAIL PROTECTED]                                     debian/rules
[EMAIL PROTECTED]                             Monkeys do it better
