Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)
On Wed, Nov 17, 2004 at 10:35:47PM +, Nicholas Clark wrote: On Wed, Nov 17, 2004 at 02:47:09PM -0700, Patrick R. Michaud wrote: BTW, it may be very possible for me to write the p6ge generator so that it can be switched between the PIR and bsr/ret calling conventions, so we don't need to resolve this entirely now. And we could then benchmark the two against each other. Keeping the code that flexible would be very interesting. If you can achieve this without much extra pain, I think that it would be worth it. Believe it or not, I think it is less pain to do it this way (keeping the conventions switchable) -- it certainly cleans up the generator code a bit. We'll see how it works out. :-) Pm
Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)
At 10:03 PM +0100 11/17/04, Leopold Toetsch wrote: Dan Sugalski wrote: [ this came up WRT calling conventions ] I assume he's doing bsr/ret to get into and out of the sub, which is going to be significantly faster. Who says that? As already stated, I don't consider these as either light-weight nor faster. Here is a benchmark. Below are 2 version of a recursive factorial program. fact(100) is calculated 1000 times: PIR 1.1 s bsr/ret 2.4 s PIR/tailcall 0.2s Unoptimized Parrot, default i.e. slow run core. Way to go with the overkill. I'm impressed. However, written more sanely the results are: PIR: real0m4.149s user0m4.120s sys 0m0.030s bsr/ret: real0m1.266s user0m1.260s sys 0m0.000s Chopping out the multiplication (since that's a not-insignificant amount of the runtime for the bsr/ret version) gives: PIR: real0m3.016s user0m2.990s sys 0m0.030s bsr/ret real0m0.344s user0m0.340s sys 0m0.010s The bsr/ret version is: start: new P16, .PerlInt set P16, 1000 elements I16, P5 lt I16, 2, def set S0, P5[1] set P16, S0 def: set I16, 0 time N16 save N16 loop: clone P1, P16 new P0, .PerlInt set P0, 1 save P16 save I16 bsr fact restore I16 restore P16 inc I16 lt I16, 1000, loop restore N16 time N17 sub N17, N17, N16 print P0 print \n print N17 print \n end # in: P0 is product, p1 is count # out: P0 is new product fact: gt P1, 1, doit ret doit: mul P0, P0, P1 dec P1 bsr fact ret -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)
At 5:08 PM -0500 11/17/04, Dan Sugalski wrote: Chopping out the multiplication (since that's a not-insignificant amount of the runtime for the bsr/ret version) gives: PIR: real0m3.016s user0m2.990s sys 0m0.030s bsr/ret real0m0.344s user0m0.340s sys 0m0.010s and with -Oc, for completeness: real0m0.416s user0m0.380s sys 0m0.030s -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)
On Wed, Nov 17, 2004 at 10:03:14PM +0100, Leopold Toetsch wrote: As already stated, I don't consider these as either light-weight nor faster. Here is a benchmark. Below are 2 version of a recursive factorial program. fact(100) is calculated 1000 times: PIR 1.1 s bsr/ret 2.4 s PIR/tailcall 0.2s Unoptimized Parrot, default i.e. slow run core. Sure, but the bsr/ret in your version is making lots of saveall calls that I'd be avoiding. Also, this code is saving pmc's (big ones at that) whereas I'll generally be pushing a few ints and maybe a string onto the stack. So, rewriting the above for ints instead of PerlInts, changing the multiply op to add to stay within the range of ints, and removing the unneeded saves/restores for things that are being passed as parameters anyway (and doubling the count save/restore to make it somewhat closer to what I'd expect...): [EMAIL PROTECTED] pmichaud]$ parrot pmfact.imc #PIR 500500 5.819842 [EMAIL PROTECTED] pmichaud]$ parrot pmfactbsr.imc #bsr/ret 500500 2.010935 Please keep in mind that I'm a newcomer to Parrot, so it's entirely possible that I'm made some invalid assumptions in my code that skew these results (and I'll freely admit them if pointed out). And I will admit that the PIR code is still impressive speed-wise relative to what it is doing, but it's hard to ignore a 60% improvement. Pm .sub optc @IMMEDIATE # TODO turn on -Oc # print optc\n .end .sub _main @MAIN .param pmc argv .local int count, product .local float start, end count = 1000 .local int argc argc = elements argv if argc 2 goto def $S0 = argv[1] count = $S0 def: .local int i i = 0 start = time .local int n loop: n = count product = 1 product = _fact(product, n) inc i if i 1000 goto loop end = time end -= start print product print \n print end print \n .end .sub _fact .param int product .param int count if count 1 goto recurs .return (product) recurs: product += count dec count product = _fact(product, count) .return (product) .end .sub _main @MAIN .param pmc argv .local int count, product .local float start, end count = 1000 .local int argc argc = elements argv if argc 2 goto def $S0 = argv[1] count = $S0 def: .local int i i = 0 start = time .local int n loop: n = count product = 1 save count bsr fact restore count inc i if i 1000 goto loop end = time end -= start print product print \n print end print \n goto ex fact: if count 1 goto recurse ret recurse: product += count dec count save count save count bsr fact restore count restore count ret ex: .end
Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)
On Wed, Nov 17, 2004 at 10:03:14PM +0100, Leopold Toetsch wrote: Dan Sugalski wrote: As already stated, I don't consider these as either light-weight nor faster. Here is a benchmark. Below are 2 version of a recursive factorial program. fact(100) is calculated 1000 times: PIR 1.1 s bsr/ret 2.4 s PIR/tailcall 0.2s Unoptimized Parrot, default i.e. slow run core. BTW, it may be very possible for me to write the p6ge generator so that it can be switched between the PIR and bsr/ret calling conventions, so we don't need to resolve this entirely now. And we could then benchmark the two against each other. Pm
Re: light-weight calling conventions (was: Second cut at a P6 grammar engine, in Parrot)
On Wed, Nov 17, 2004 at 02:47:09PM -0700, Patrick R. Michaud wrote: BTW, it may be very possible for me to write the p6ge generator so that it can be switched between the PIR and bsr/ret calling conventions, so we don't need to resolve this entirely now. And we could then benchmark the two against each other. Keeping the code that flexible would be very interesting. If you can achieve this without much extra pain, I think that it would be worth it. Nicholas Clark