Re: OO benches
At 11:19 AM +0100 4/17/04, Piers Cawley wrote:
> Leopold Toetsch <[EMAIL PROTECTED]> writes:
>> Aaron Sherman <[EMAIL PROTECTED]> wrote:
>>> On Fri, 2004-04-16 at 18:18, Leopold Toetsch wrote:
>>
>>> Sorry, I gave the wrong impression. I meant it looks suspiciously like
>>> Python is doing a lazy construction on those objects, not that there is
>>> anything wrong with the benchmark.
>>
>> No, I don't think that this is happening. Parrot's slightly slower
>> object instantiation is due to register preserving mainly. The "__init"
>> code is run from inside the "new PObj, IClass" opcode. As it's not known
>> that a method call is happening here, we can't use register preserving
>> operations that only save needed registers--we have to save all
>> registers. These two memcpys are the most heavy part of the operation.
>
> Maybe we should rethink that then and make allocation and
> initialization two different phases.

That's the way I'm leaning. I know it's a *bad* idea from a high-level
language point of view, but from the lower levels it's less of a bad
idea. New, then, would allocate the object and you'd need to then call
its constructor, with the constructor call using full-on parrot calling
conventions and giving the calling code a chance to save the registers
it was interested in.

Of course, then we get into the issue of handling return values from
multiple calls into methods as we automatically redispatch the
constructor, but...
--
                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
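A rough sketch of the two-phase scheme discussed in this message, written in Python because Python already exposes allocation and initialization as separate steps (`__new__` and `__init__`). The class `FF` and the helper `new_without_init` are hypothetical names used only for illustration; this models the idea and is not how Parrot implements `new`.

```python
class FF:
    """Toy class whose initializer mirrors the __init sub in this thread."""

    def __init__(self):
        # Initialization phase: an ordinary method call, so the caller
        # decides what state it wants to preserve around it.
        self.r = 0


def new_without_init(cls):
    """Allocation phase only: create the object, run no initializer."""
    return cls.__new__(cls)


obj = new_without_init(FF)              # phase 1: allocate
allocated_only = not hasattr(obj, "r")  # no attributes exist yet
obj.__init__()                          # phase 2: explicit constructor call
```

With the phases split like this, the constructor call is visible at the call site, which is what would let the caller save only the registers it cares about.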
Re: OO benches
Piers Cawley wrote:
> Leopold Toetsch <[EMAIL PROTECTED]> writes:
>> These two memcpys are the most heavy part of the operation.
>
> Maybe we should rethink that then and make allocation and
> initialization two different phases. Or dictate that
>     new PObj, IClass
> should be treated as if it were a function call with all the caller
> saves implications that go with it.

Well, it's not only object creation. While this is a bit special and
could have a special syntax, the problem is with all delegate usage,
e.g. for tying.

If we need some extra speed for object creation, we could define it as

    new PObj, IClass, "BUILD"      # call sub in BUILD prop
    new PObj, IClass, "CONSTRUCT"  # call sub in CONSTRUCT prop
    new PObj, IClass               # no init call at all

and just save needed registers, as we know that a Sub is called (or
not).

But as said, it doesn't help here:

    $ time perl ff.pl
    010
    real 0m3.287s
    $ time parrot -j ff.pasm
    010
    real 0m2.334s

leo :)

ff.pl
Description: Perl program

        newclass P1, "FF"
        addattribute P1, 'r'
        find_type I12, "FF"
        new P2, I12
        set I10, 0
        set I11, 50
    loop:
        set I15, P2
        inc I10
        lt I10, I11, loop
        set I15, P2
        print I15
        set I15, P2
        print I15
        set I15, P2
        print I15
        print "\n"
        end

    .namespace ["FF"]
    .pcc_sub __init:
        classoffset I0, P2, 'FF'
        new P3, .PerlInt
        setattribute P2, I0, P3
        invoke P1
    .pcc_sub __get_integer:
        classoffset I0, P2, 'FF'
        getattribute P3, P2, I0
        new P4, .PerlInt
        band P4, P3, 1
        inc P3
        if P4, r1
        set I5, 0
        invoke P1
    r1:
        set I5, 1
        invoke P1
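For readers who don't speak PASM, here is a small Python model of what the listing in this message appears to do: each integer read of the object returns the low bit of attribute `r` and then increments it. The method name `get_integer` and the function `run` are illustrative assumptions, not part of the original benchmark sources.

```python
class FF:
    def __init__(self):
        self.r = 0                  # attribute 'r' starts as integer 0

    def get_integer(self):
        low_bit = self.r & 1        # band P4, P3, 1
        self.r += 1                 # inc P3
        return 1 if low_bit else 0  # I5 is set to 1 or 0


def run(iterations=50):
    obj = FF()
    for _ in range(iterations):     # the "loop:" block, result discarded
        obj.get_integer()
    # Three more reads, each one printed by the PASM code.
    return "".join(str(obj.get_integer()) for _ in range(3))


output = run()
```

After 50 reads `r` is 50, so the three printed reads yield `010`, which agrees with the output shown in the timing transcript.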
Re: OO benches
Leopold Toetsch <[EMAIL PROTECTED]> writes:
> Aaron Sherman <[EMAIL PROTECTED]> wrote:
>> On Fri, 2004-04-16 at 18:18, Leopold Toetsch wrote:
>
>> Sorry, I gave the wrong impression. I meant it looks suspiciously like
>> Python is doing a lazy construction on those objects, not that there is
>> anything wrong with the benchmark.
>
> No, I don't think that this is happening. Parrot's slightly slower
> object instantiation is due to register preserving mainly. The "__init"
> code is run from inside the "new PObj, IClass" opcode. As it's not known
> that a method call is happening here, we can't use register preserving
> operations that only save needed registers--we have to save all
> registers. These two memcpys are the most heavy part of the operation.

Maybe we should rethink that then and make allocation and
initialization two different phases. Or dictate that

    new PObj, IClass

should be treated as if it were a function call with all the caller
saves implications that go with it.
Re: OO benches
Jeff Clites <[EMAIL PROTECTED]> wrote:

> BTW, I'm failing a bunch of tests now (Mac OS X); not sure if it's
> related:

Fixed. It was caused by the faster PMC creation code I put in earlier
this week, and it shows up when ARENA_DOD_FLAGS is off (e.g. due to
missing memalign).

Thanks for reporting,
leo
Re: OO benches
Jeff Clites <[EMAIL PROTECTED]> wrote:

> BTW, I'm failing a bunch of tests now (Mac OS X); not sure if it's
> related:

> t/op/gc..........NOK 11
> # Failed test (t/op/gc.t at line 219)
> #      got: 'get_pmc_keyed_str() not implemented in class
> 'RetContinuation''

I see that now too, after recompiling with ARENA_DOD_FLAGS turned off.
The property hash got freed during DOD. I'm still searching for the
reason.

leo
Re: OO benches
On Apr 16, 2004, at 11:19 PM, Jeff Clites wrote:

> BTW, I'm failing a bunch of tests now (Mac OS X); not sure if it's
> related:
>
> Failed Test           Stat Wstat Total Fail  Failed  List of Failed
> --------------------------------------------------------------------
> t/op/gc.t                1   256    13    1   7.69%  11
> t/pmc/dumper.t          13  3328    13   13 100.00%  1-13
> t/pmc/object-meths.t     1   256    19    1   5.26%  9
> t/pmc/objects.t          7  1792    37    7  18.92%  23-26 28 35-36

And of those, only these 2 fail if run without --gc-debug, _or_ if
configured with --optimize (seems like an odd correlation):

Failed Test           Stat Wstat Total Fail  Failed  List of Failed
--------------------------------------------------------------------
t/op/gc.t                1   256    13    1   7.69%  11
t/pmc/dumper.t           1   256    13    1   7.69%  12

JEff
Re: OO benches
Jeff Clites <[EMAIL PROTECTED]> wrote:
> On Apr 16, 2004, at 9:29 AM, Leopold Toetsch wrote:

>> $ ./bench -b=^oo[234f]

> Looks cool!

Yep.

> BTW, I'm failing a bunch of tests now (Mac OS X); not sure if it's
> related:

Strange. valgrind doesn't indicate any problem with these tests.

> I'll poke a bit and see if I can figure out what's going on.

Yes please.

>> - constant strings e.g. "BUILD" get a precomputed hash value from
>>   c2str.pl

> This isn't checked in yet, right? (Didn't see c2str.pl anywhere.)

It was attached to yesterday's message "Constant strings - again". But
I'll resend it with my recent changes WRT hash value precalculation.

>> - use of _S("BUILD") and _S("CONSTRUCT") in objects.c

> Mac OS X doesn't like the _S()--it seems it may already be defined to
> something. How about something clearer (and less likely to conflict)
> instead, like STRING_LITERAL()?

We can undef it before using. STRING_LITERAL is more typing and doesn't
assure uniqueness - so rather not. Maybe PSC() - Parrot String Constant.

> JEff

leo
Re: OO benches
Aaron Sherman <[EMAIL PROTECTED]> wrote:
> On Fri, 2004-04-16 at 18:18, Leopold Toetsch wrote:

> Sorry, I gave the wrong impression. I meant it looks suspiciously like
> Python is doing a lazy construction on those objects, not that there is
> anything wrong with the benchmark.

No, I don't think that this is happening. Parrot's slightly slower
object instantiation is mainly due to register preserving. The "__init"
code is run from inside the "new PObj, IClass" opcode. As it's not known
that a method call is happening here, we can't use register preserving
operations that only save needed registers--we have to save all
registers. These two memcpys are the heaviest part of the operation.

> Lazy construction is perhaps something Parrot should think about too,

I can't imagine that lazy construction could be of any value. You have
to construct the object eventually, so the total cost is the sum of the
two parts. And 90% (or ~100% with gcc 3.3.3 on a Pentium) of Python's
performance isn't that bad, all the more so as Python AFAIK is
constructing a kind of hash while we have a full-fledged object.

leo
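A toy model of the register-preserving cost described in this message, comparing "save everything" with "save only what the caller needs". The layout of four banks (I, N, S, P) with 32 registers each is an assumption made for the sketch, and the function names are hypothetical; the real work in Parrot is done with memcpy in C, not like this.

```python
BANKS = ("I", "N", "S", "P")
REGS_PER_BANK = 32  # assumed register file layout


def make_registers():
    return {bank: list(range(REGS_PER_BANK)) for bank in BANKS}


def save_all(regs):
    """Copy every register: needed when the callee is not known."""
    saved = {bank: values[:] for bank, values in regs.items()}
    copied = sum(len(values) for values in saved.values())
    return saved, copied


def save_needed(regs, needed):
    """Copy only the (bank, index) pairs the caller still cares about."""
    saved = {(bank, i): regs[bank][i] for bank, i in needed}
    return saved, len(saved)


regs = make_registers()
_, cost_all = save_all(regs)
_, cost_needed = save_needed(regs, [("P", 1), ("P", 2), ("I", 10)])
```

Here `cost_all` counts 128 copied slots against 3 for `cost_needed`, which is the gap an explicit, visible method call would let the caller exploit.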
Re: OO benches
On Apr 16, 2004, at 9:29 AM, Leopold Toetsch wrote:

> With all current optimizations[1] I now have these timings:
>
> $ ./bench -b=^oo[234f]
> Numbers are relative to the first one. (lower is better)
>         p-j-Oc  perl-th  perl    python  ruby
> oo2     100%    182%     152%    90%     132%
> oo3     100%    276%     256%    333%    383%
> oo4     100%    137%     128%    171%    292%
> oofib   100%    303%     261%    157%    161%

Looks cool!

BTW, I'm failing a bunch of tests now (Mac OS X); not sure if it's
related:

Failed Test           Stat Wstat Total Fail  Failed  List of Failed
--------------------------------------------------------------------
t/op/gc.t                1   256    13    1   7.69%  11
t/pmc/dumper.t          13  3328    13   13 100.00%  1-13
t/pmc/object-meths.t     1   256    19    1   5.26%  9
t/pmc/objects.t          7  1792    37    7  18.92%  23-26 28 35-36

The gc test is failing with:

t/op/gc..........NOK 11
# Failed test (t/op/gc.t at line 219)
#      got: 'get_pmc_keyed_str() not implemented in class 'RetContinuation''
# expected: 'hello
# hello
# '
# '(cd . && ./parrot -b --gc-debug /tmp/gc_11.pasm)' failed with exit code 2

and all of the dumper ones look like double-frees:

t/pmc/dumper.....NOK 7
# Failed test (t/pmc/dumper.t at line 359)
#      got: '*** malloc[9416]: Deallocation of a pointer not malloced:
0x200ee30; This could be a double free(), or free() called with the
middle of an allocated block

I'll poke a bit and see if I can figure out what's going on.

> - constant strings e.g. "BUILD" get a precomputed hash value from
>   c2str.pl

This isn't checked in yet, right? (Didn't see c2str.pl anywhere.)

> - use of _S("BUILD") and _S("CONSTRUCT") in objects.c

Mac OS X doesn't like the _S()--it seems it may already be defined to
something. How about something clearer (and less likely to conflict)
instead, like STRING_LITERAL()?

JEff
Re: OO benches
On Fri, 2004-04-16 at 18:18, Leopold Toetsch wrote:
> Aaron Sherman <[EMAIL PROTECTED]> wrote:
>
> > That looks suspicious... especially Python.
>
> You have the sources in examples/benchmarks. Maybe we are comparing
> apples and oranges. But the code looks good to me.

Sorry, I gave the wrong impression. I meant it looks suspiciously like
Python is doing a lazy construction on those objects, not that there is
anything wrong with the benchmark.

Lazy construction is perhaps something Parrot should think about too,
though I've not looked into what Parrot does now. How often would it be
of any value to construct a stub object which still needed to be fully
constructed before use? Do such objects pop into existence implicitly in
any commonly-used places that would yield a performance win in the
general case? Just wondering why Python would do that (if, indeed it is
doing that).

--
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
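To make the term concrete, here is a small Python sketch of what "lazy construction" would mean: creating the object only produces a cheap stub, and the real setup happens on first use. `LazyObject` and its methods are hypothetical names for illustration; nothing here claims that Python or Parrot actually behaves this way.

```python
class LazyObject:
    """Stub that defers its expensive setup until first attribute use."""

    def __init__(self):
        self._built = False

    def _build(self):
        # Stand-in for the real (costly) construction work.
        self.attrs = {"r": 0}
        self._built = True

    def get(self, name):
        if not self._built:
            self._build()
        return self.attrs[name]


stubs = [LazyObject() for _ in range(1000)]
built_before_use = sum(obj._built for obj in stubs)
value = stubs[0].get("r")
built_after_use = sum(obj._built for obj in stubs)
```

A benchmark that only instantiates objects would pay for none of the `_build` calls in such a scheme, while a benchmark that reads attributes would pay for all of them.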
Re: OO benches
Aaron Sherman <[EMAIL PROTECTED]> wrote:

> That looks suspicious... especially Python.

You have the sources in examples/benchmarks. Maybe we are comparing
apples and oranges. But the code looks good to me.

> I would suggest using iterations that go much longer so that you can
> detect over-optimizations and such more easily.

More benchmarks welcome.

> Very nice!

leo
Re: OO benches
On Fri, 2004-04-16 at 12:29, Leopold Toetsch wrote:
> With all current optimizations[1] I now have these timings:
>
> $ ./bench -b=^oo[234f]
> Numbers are relative to the first one. (lower is better)
>         p-j-Oc  perl-th  perl    python  ruby
> oo2     100%    182%     152%    90%     132%
> oo3     100%    276%     256%    333%    383%

That looks suspicious... especially Python. It smells like there's some
lazy evaluation going on here, and that the object doesn't get fully
instantiated until oo3. I suspect, in that light, that the numbers
aren't quite as bad for Parrot as they look in oo2, nor as good for
Parrot as they look in oo3 (well, maybe as good as they look, but not as
bad... I have to think about that).

> $ time CALL__BUILD=1 parrot -j oo2b.pasm
> real 0m2.566s
> vs 0m2.630s for oo2.pasm
> (w.o any of these optimizations oo2b takes 3.9s)

I would suggest using iterations that go much longer so that you can
detect over-optimizations and such more easily.

Very nice!

--
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
OO benches
With all current optimizations[1] I now have these timings:

$ ./bench -b=^oo[234f]
Numbers are relative to the first one. (lower is better)
        p-j-Oc  perl-th  perl    python  ruby
oo2     100%    182%     152%    90%     132%
oo3     100%    276%     256%    333%    383%
oo4     100%    137%     128%    171%    292%
oofib   100%    303%     261%    157%    161%

And

$ time CALL__BUILD=1 parrot -j oo2b.pasm
real 0m2.566s
vs 0m2.630s for oo2.pasm
(w.o any of these optimizations oo2b takes 3.9s)

oo2 is basically object instantiation, oo2b calls the method in the
BUILD property, oo2 calls __init directly.
oo3 is attribute get
oo4 is attribute set (where Parrot creates new PMCs, which isn't really
needed :)
oofib tests mostly function/method call speed

[1]
- set_string_native references the string
- constant strings e.g. "BUILD" get a precomputed hash value from
  c2str.pl
- use of _S("BUILD") and _S("CONSTRUCT") in objects.c

Athlon 800
Parrot -O3 (gcc 2.92.2)
perl-th is threaded 5.8
perl is 5.8 with long double support
python 2.3.3
ruby 1.8.0
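The "relative to the first one" convention used in the table can be sketched in a few lines of Python. The helper `relative_percent` is a hypothetical name, and the inputs are only the three wall-clock times quoted in this message; the absolute timings behind the main table are not given, so they are not reproduced here.

```python
def relative_percent(timings, baseline):
    """Express each timing as a rounded percentage of the baseline run."""
    base = timings[baseline]
    return {name: round(100 * t / base) for name, t in timings.items()}


# Wall-clock seconds quoted in the message above.
oo2_runs = {"oo2b": 2.566, "oo2": 2.630, "oo2b unoptimized": 3.9}
rel = relative_percent(oo2_runs, "oo2b")
```

With oo2b as the baseline, oo2 comes out at 102% and the unoptimized oo2b run at 152%, read the same way as the table: lower is better.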