On Aug 24, 2005, at 19:45, Sam Ruby wrote:


> Leopold Toetsch wrote:
>> Sam Ruby wrote:


>>> The return value is a callable sub.

> More precisely: a curried function call.  This is an important
> distinction; to see why, see below.

A callable sub may of course be a curried one - yes.

>> The interesting optimizations are:
>> - cache the find_method in the global method cache.
>>   This happens already, if the method string is constant.

> Note that you would then be caching the results of a curried function
> call.  This result depends not only on the method string, but also on
> the particular object upon which it was invoked.

No the "inner" Parrot_find_method_with_cache just caches the method for a specific class (obeying C3 method resolution order as in Python). There is no curried function at that stage.


>> - use a PIC [1] (which is tied to the opcode) and cache
>>   the final result of find_method (or any similar lookup)

> Again, the results will depend on the object.

Yes. The nice thing with a PIC is that it is per bytecode location. You have a different PIC and a different cache for every call site. The prologue of a PIC opcode is basically:

  if cache.version == interpreter.version:
      (cache.function)(args)    # hit: invoke the cached result directly
  else:
      # miss: do the dynamic lookup,
      # then update the cache and repeat

The 'version' comparison depends on the cached thingy and is more explicit in individual implementations. But the principle is always the same: you create a unique id that depends on the variables of the lookup and remember it. Before invoking the cached result you compare the actual id with the cached one. If there is a cache miss, there are 2 or 3 more cache slots to consult before falling back to the original dynamic scheme (and maybe the opcode is rewritten back to the plain dynamic one in case of too many cache misses).
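
A sketch of that slot handling, again in illustrative Python (the class
and its fields are made up; a real PIC lives in the rewritten opcode,
not in a heap object):

    class CallSitePIC:
        SLOTS = 4                   # ~99.5 % of sites fit in 4 slots

        def __init__(self):
            self.entries = []       # (unique id, cached function) pairs

        def call(self, unique_id, args, dynamic_lookup):
            for cached_id, fn in self.entries:  # actual id vs. cached ids
                if cached_id == unique_id:
                    return fn(*args)            # hit: run the cached result
            fn = dynamic_lookup()               # miss: original dynamic scheme
            if len(self.entries) < self.SLOTS:
                self.entries.append((unique_id, fn))
            # (too many misses would rewrite the opcode back to the plain
            # dynamic one)
            return fn(*args)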

src/pic.c and ops/pic.ops have an implementation for sub Px, Py - just for fun and profit to show the performance in the MOPS benchmark ;-)

The important part in ops/pic.ops is:

    /* pack both operand types into one comparable key */
    lr_types = (left->vtable->base_type << 16) | right->vtable->base_type;
    if (lru->lr_type == lr_types) {
runit_v_pp:
        /* hit: call the cached MMD function directly */
        ((mmd_f_v_pp)lru->f.real_function)(interpreter, left, right);

(the lru is part of the cache structure; the above code is only run when both types are <= 0xffff)

MMD depends on the two involved types. This type pair is compared before calling the cached function directly. As the cache is per bytecode location, there is a fair chance (>95 %) that the involved types for this very code location match.

The same is true for plain method calls. The callee depends on the invocant and the method name, which is usually a constant. Therefore you can compare the cached entry with the actual invocant type and normally, on a match, just run the cached function immediately.
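
As a sketch (hypothetical cache fields again), the fast path of one such
call site boils down to:

    def call_method(cache, invocant, args):
        # guard: the name is constant, only the invocant type can vary
        if cache.type is type(invocant):
            return cache.function(invocant, *args)  # hit: run immediately
        # miss: dynamic lookup (e.g. find_method_with_cache above), refill
        cache.type = type(invocant)
        cache.function = find_method_with_cache(type(invocant), cache.name)
        return cache.function(invocant, *args)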

Currying is only important for placing 'self' into the arguments - the actual lookup was already done earlier and doesn't influence the called function. Actually a curried subroutine is a 'Sub' object already and is invoked directly without any further method lookup. There is no caching involved in the call of a curried sub.

>> - run the invoked Sub inside the same run loop - specifically
>>   not within Parrot_run_meth_fromc_args

> I don't understand this.  (Note: it may not be important to this
> discussion that I do understand this - all that is important to me is
> that it works, and somehow I doubt that Parrot_run_meth_fromc_args cares
> whether a given function is curried or not).

There are two points to be considered:
- currying: the effect is that some call arguments (in this special case
  the object) are already fixed. The argument-passing code therefore has
  the duty to insert these known (and remembered) arguments into the
  params for the callee. For the BoundMethod this means, of course: shift
  all arguments up by one and make the object 'self' the first param of
  the sub (see the sketch below).
- the second point is only related to call speed, i.e. optimization: it's
  just faster to run a PIR sub in the same run loop than to create a new
  run loop.
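
A minimal sketch of such a BoundMethod in Python terms (the class here is
made up for illustration; Parrot's actual BoundMethod is a PMC):

    class BoundMethod:
        def __init__(self, obj, sub):
            self.obj = obj        # the remembered, curried argument
            self.sub = sub        # the already-looked-up Sub

        def __call__(self, *args):
            # no further lookup: shift args up by one, 'self' goes first
            return self.sub(self.obj, *args)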

> The above works because PyInt is a constant.  It probably can be
> extended to handle things that seem unlikely to change very rapidly.

Yes. That's the 'trick' behind PIC. It works best the more constant the items are. But as said above, method names and invocants usually don't vary *per bytecode location*. The literature states that ~95 % of method calls are monomorphic (one type of invocant) and that 99.5 % are cached within 4 cache slots. Look at some typical code:

   a.'foo'(x)
   ...
   b.'bar'(y)

Both method calls have a distinct cache. The method names are constant. Therefore the callee depends only on the invocant. The types of 'a' and 'b' are typically always the same (except maybe inside a compiler's AST visit methods or some such). The same scheme applies to plain method or attribute lookups.

> But the combination of decisions on how to handle the passing of the
> "self" parameter to a method, keeping find_method and invoke separated
> at the VTABLE level, and the semantics of Python make the notion of
> caching the results of find_method problematic.

Hmm. Regarding 'self': this exactly matches the Python or Perl calling conventions. As you have shown in your example, you can call a Python "method" (which actually is just a function) with either method or function call syntax:

  o.foo(x)
  foo(o, x)

are basically the same when it comes to argument passing (I'm not talking about locating 'foo' here).
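
In plain Python the equivalence is directly observable (a toy example):

    class O(object):
        def foo(self, x):
            return x + 1

    o = O()
    assert o.foo(41) == O.foo(o, 41)   # method syntax == function syntax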

This is exactly what is implemented now in Parrot. I think this is an improvement over the old state.

A distinct find_method_then_invoke vtable would be a replica of the callmethodcc opcode and is more a matter of optimization than anything else, because you can already change the behavior of both parts now.

> - Sam Ruby

leo
