Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)

2004-11-18 Thread Patrick R. Michaud
On Wed, Nov 17, 2004 at 10:35:47PM +, Nicholas Clark wrote:
 On Wed, Nov 17, 2004 at 02:47:09PM -0700, Patrick R. Michaud wrote:
 
  BTW, it may be very possible for me to write the p6ge generator so 
  that it can be switched between the PIR and bsr/ret calling conventions,
  so we don't need to resolve this entirely now.  And we could then benchmark
  the two against each other.
 
 Keeping the code that flexible would be very interesting. If you can
 achieve this without much extra pain, I think that it would be worth it.

Believe it or not, I think it is less pain to do it this way
(keeping the conventions switchable) -- it certainly cleans up the
generator code a bit.  We'll see how it works out.  :-)

Pm


Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)

2004-11-17 Thread Dan Sugalski
At 10:03 PM +0100 11/17/04, Leopold Toetsch wrote:
Dan Sugalski wrote:
[ this came up WRT calling conventions ]
I assume he's doing bsr/ret to get into and out of the sub, which 
is going to be significantly faster.
Who says that?
As already stated, I don't consider these as either light-weight nor 
faster. Here is a benchmark.

Below are 2 version of a recursive factorial program. fact(100) is 
calculated 1000 times:

PIR   1.1 s
bsr/ret   2.4 s
PIR/tailcall  0.2s
Unoptimized Parrot, default i.e. slow run core.
Way to go with the overkill. I'm impressed. However, written more 
sanely the results are:

PIR:
real0m4.149s
user0m4.120s
sys 0m0.030s
bsr/ret:
real0m1.266s
user0m1.260s
sys 0m0.000s
Chopping out the multiplication (since that's a not-insignificant 
amount of the runtime for the bsr/ret version) gives:

PIR:
real0m3.016s
user0m2.990s
sys 0m0.030s
bsr/ret
real0m0.344s
user0m0.340s
sys 0m0.010s
The bsr/ret version is:
start:
new P16, .PerlInt
set P16, 1000
elements I16, P5
lt I16, 2, def
set S0, P5[1]
set P16, S0
def:   
set I16, 0
time N16
save N16

loop:
clone P1, P16
new P0, .PerlInt
set P0, 1
save P16
save I16
bsr fact
restore I16
restore P16
inc I16
lt I16, 1000, loop
restore N16
time N17
sub N17, N17, N16
print P0
print \n
print N17
print \n
end

# in: P0 is product, p1 is count
# out: P0 is new product
fact:
  gt P1, 1, doit
  ret
doit:
  mul P0, P0, P1
  dec P1
  bsr fact
  ret
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)

2004-11-17 Thread Dan Sugalski
At 5:08 PM -0500 11/17/04, Dan Sugalski wrote:
Chopping out the multiplication (since that's a not-insignificant 
amount of the runtime for the bsr/ret version) gives:

PIR:
real0m3.016s
user0m2.990s
sys 0m0.030s
bsr/ret
real0m0.344s
user0m0.340s
sys 0m0.010s
and with -Oc, for completeness:
real0m0.416s
user0m0.380s
sys 0m0.030s
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)

2004-11-17 Thread Patrick R. Michaud
On Wed, Nov 17, 2004 at 10:03:14PM +0100, Leopold Toetsch wrote:
 As already stated, I don't consider these as either light-weight nor 
 faster. Here is a benchmark.
 
 Below are 2 version of a recursive factorial program. fact(100) is 
 calculated 1000 times:
 
 PIR   1.1 s
 bsr/ret   2.4 s
 PIR/tailcall  0.2s
 
 Unoptimized Parrot, default i.e. slow run core.

Sure, but the bsr/ret in your version is making lots of saveall calls 
that I'd be avoiding.  Also, this code is saving pmc's (big ones at 
that) whereas I'll generally be pushing a few ints and maybe a string 
onto the stack.  So, rewriting the above for ints instead of PerlInts, 
changing the multiply op to add to stay within the range of ints, and 
removing the unneeded saves/restores for things that are being passed 
as parameters anyway (and doubling the count save/restore to make it 
somewhat closer to what I'd expect...):

[EMAIL PROTECTED] pmichaud]$ parrot pmfact.imc #PIR
500500
5.819842
[EMAIL PROTECTED] pmichaud]$ parrot pmfactbsr.imc  #bsr/ret
500500
2.010935

Please keep in mind that I'm a newcomer to Parrot, so it's entirely
possible that I'm made some invalid assumptions in my code that skew
these results (and I'll freely admit them if pointed out).
And I will admit that the PIR code is still impressive speed-wise
relative to what it is doing, but it's hard to ignore a 60% improvement.

Pm
.sub optc @IMMEDIATE
# TODO turn on -Oc
# print optc\n
.end
.sub _main @MAIN
.param pmc argv
.local int count, product
.local float start, end
count = 1000
.local int argc
argc = elements argv
if argc  2 goto def
$S0 = argv[1]
count = $S0
def:
.local int i
i = 0
start = time
.local int n
loop:
n = count
product = 1
product = _fact(product, n)
inc i
if i  1000 goto loop
end = time
end -= start
 print product
 print \n
print end
print \n
.end
.sub _fact
   .param int product
   .param int count
   if count  1 goto recurs
   .return (product)
recurs:
   product += count
   dec count
   product = _fact(product, count)
   .return (product)
.end

.sub _main @MAIN
.param pmc argv
.local int count, product
.local float start, end
count = 1000
.local int argc
argc = elements argv
if argc  2 goto def
$S0 = argv[1]
count = $S0
def:
.local int i
i = 0
start = time
.local int n
loop:
n = count
product = 1
save count
bsr fact
restore count
inc i
if i  1000 goto loop
end = time
end -= start
print product
print \n
print end
print \n
goto ex

fact:
if count  1 goto recurse
ret
recurse:
product += count
dec count
save count
save count
bsr fact
restore count
restore count
ret

ex:
.end



Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)

2004-11-17 Thread Patrick R. Michaud
On Wed, Nov 17, 2004 at 10:03:14PM +0100, Leopold Toetsch wrote:
 Dan Sugalski wrote:
 
 As already stated, I don't consider these as either light-weight nor 
 faster. Here is a benchmark.
 
 Below are 2 version of a recursive factorial program. fact(100) is 
 calculated 1000 times:
 
 PIR   1.1 s
 bsr/ret   2.4 s
 PIR/tailcall  0.2s
 
 Unoptimized Parrot, default i.e. slow run core.

BTW, it may be very possible for me to write the p6ge generator so 
that it can be switched between the PIR and bsr/ret calling conventions,
so we don't need to resolve this entirely now.  And we could then benchmark
the two against each other.

Pm



Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)

2004-11-17 Thread Nicholas Clark
On Wed, Nov 17, 2004 at 02:47:09PM -0700, Patrick R. Michaud wrote:

 BTW, it may be very possible for me to write the p6ge generator so 
 that it can be switched between the PIR and bsr/ret calling conventions,
 so we don't need to resolve this entirely now.  And we could then benchmark
 the two against each other.

Keeping the code that flexible would be very interesting. If you can
achieve this without much extra pain, I think that it would be worth it.

Nicholas Clark