Dan Sugalski <[EMAIL PROTECTED]> wrote:

> Anyway, so much for the 'outside world' view of objects as black box
> things that have properties and methods.
[ ... ]

> Almost everything we do here is going to be with method calls.
> There's very little that I can see that requires any faster access
> than that, so as long as we have a proper protocol that everyone can
> conform to, we should be OK.

The distinction between object and class still gets in the way a bit
when it comes to method calls:

  $ python
  ...
  >>> i = 4
  >>> i.__sub__(1)
  3
  >>> help(int)
  ...
   |  __sub__(...)
   |      x.__sub__(y) <==> x-y

Given that a PyInt is just a PMC, it currently needs a lot of ugly
hacks to implement the full set of operator methods.
dynclasses/py*.pmc is mostly just duplicating existing code from
standard PMCs.

If the assembler just emits a method call for the opcode:

  sub P0, P1, P2

we get that equivalence too, with proper inheritance and static and
dynamic operator overloading.

I've duplicated FixedPMCArray.get_pmc_keyed_int() as a method,
starting with this line:

  METHOD PMC* __getitem__(INTVAL key) {

and measured 5 M array lookups:

  $ parrot -C vt-bench.pasm
  Vtable get_pmc_keyed_int   0.32 s
  __getitem__                0.36 s

The actual code in the loop is:

  lp2:
      callmethodcc "__getitem__"
      inc I16
      lt I16, 5000000, lp2

Our internal name for this method is "__get_pmc_keyed_int", but that
doesn't really matter: either the Python translator emits the proper
name, or the method gets aliased at runtime.

The timing above was again done with the CGP core, and the method is
cached in the PIC (polymorphic inline cache). This is the relevant
part of the executed opcode:

  typedef PMC* (*func_pi)(Interp*, PMC*, INTVAL);
  func_pi f;

  pic  = (Parrot_PIC *) cur_opcode[1];
  idx  = REG_INT(5);
  self = REG_PMC(2);
  if (self->vtable->base_type == pic->lru[0].lr_type) {
      f = (func_pi)pic->lru[0].f.real_function;
      REG_PMC(5) = (f)(interpreter, self, idx);
      goto *((void*)*(cur_opcode += 2));
  }

The PIC has three cache slots. The literature states that this fast
path is hit in about 94% of cases. If the type of the object at that
bytecode location changes, another ~5% of calls are handled in a
second C<if> block. If all type matches fail, or on the very first
call, VTABLE_find_method is called to get the function pointer. (A
standalone sketch of this caching scheme is appended at the end of
this mail.)

When find_method returns a subroutine object, a different opcode is
executed that calls the overloaded method. Overloaded methods are
*not* executed in a secondary runloop anymore, which improves
performance by about 50%.

I've already shown that this scheme also fits nicely for
function-like opcodes (math.sin() and such).

I think it's worth some more consideration.

leo
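
Appended is the sketch mentioned above: a minimal, self-contained
model in plain C of a three-slot polymorphic inline cache keyed on a
type id. All names in it (Obj, CallSite, pic_call, slow_find_method,
the toy method table) are invented for illustration only; they stand
in for the real vtable->base_type, Parrot_PIC, its lr_type /
real_function slots and VTABLE_find_method, and this is not the
interpreter's actual code.

  /*
   * Minimal sketch of a three-slot polymorphic inline cache (PIC)
   * keyed on a type id.  All names are invented for illustration:
   * Obj stands in for a PMC, obj->type for vtable->base_type,
   * CallSite for Parrot_PIC, slow_find_method for VTABLE_find_method.
   * This is not Parrot's actual code.
   */
  #include <stdio.h>

  typedef struct Obj Obj;
  typedef int (*Method)(Obj *self, int arg);

  struct Obj {
      int type;               /* plays the role of vtable->base_type */
      int value;
  };

  /* Two toy "__sub__" implementations, one per type. */
  static int int_sub(Obj *self, int arg) { return self->value - arg; }
  static int num_sub(Obj *self, int arg) { return self->value - arg - 1; }

  static Method method_table[2] = { int_sub, num_sub };

  /* The slow path: a full method lookup by type.  In the interpreter
   * this would be the expensive find_method call. */
  static Method slow_find_method(int type)
  {
      return method_table[type];
  }

  /* One cache entry: the type seen at this call site and the function
   * that was resolved for it. */
  typedef struct {
      int    type;
      Method fn;
  } CacheEntry;

  /* Each call site owns three entries, newest first. */
  typedef struct {
      CacheEntry lru[3];
      int        filled;
  } CallSite;

  static int pic_call(CallSite *site, Obj *self, int arg)
  {
      Method fn;
      int    i;

      /* Fast path: compare the type id against the cached slots.
       * Slot 0 is the one the quoted opcode checks inline. */
      for (i = 0; i < site->filled; i++) {
          if (site->lru[i].type == self->type)
              return site->lru[i].fn(self, arg);
      }

      /* Miss (or first call): do the full lookup, then push the result
       * into slot 0, shifting older entries down and dropping the
       * oldest one when all three slots are already in use. */
      fn = slow_find_method(self->type);
      if (site->filled < 3)
          site->filled++;
      for (i = site->filled - 1; i > 0; i--)
          site->lru[i] = site->lru[i - 1];
      site->lru[0].type = self->type;
      site->lru[0].fn   = fn;
      return fn(self, arg);
  }

  int main(void)
  {
      CallSite site;
      Obj a = { 0, 4 };       /* type 0, value 4 */
      Obj b = { 1, 4 };       /* type 1, value 4 */

      site.filled = 0;        /* start with an empty cache */

      printf("%d\n", pic_call(&site, &a, 1));  /* miss, then cached: 3 */
      printf("%d\n", pic_call(&site, &a, 1));  /* slot-0 hit:        3 */
      printf("%d\n", pic_call(&site, &b, 1));  /* new type cached:   2 */
      return 0;
  }

The point of the per-call-site slots is that a site which only ever
sees one or two types pays nothing more than a type compare and an
indirect call, which is what keeps the __getitem__ timing close to
the raw vtable lookup above.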