Charles Oliver Nutter wrote:
So I'm trying to reconcile the situation. I certainly don't want to lose MethodCache performance, but I also need the invalidation callback so that inline caching can be efficient: the cache flush has to be triggered outside the call path, rather than by an additional guard in the call code.
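
Roughly, the difference looks something like the following sketch (hypothetical names, not JRuby's actual classes): the guarded version pays for a serial-number check on every call, while the callback version keeps only a null check on the hot path and lets the class flush the cache when a method is (re)defined.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch (these are not JRuby's real classes) contrasting an
    // inline cache that guards on every call with one that is flushed from
    // outside the call path by a callback when the class's method table changes.
    interface DynamicMethod { Object call(Object self, Object[] args); }

    class ClassSketch {
        private final Map<String, DynamicMethod> methods = new HashMap<>();
        private final List<Runnable> flushCallbacks = new ArrayList<>();
        private int serial = 0;

        int serial() { return serial; }
        DynamicMethod searchMethod(String name) { return methods.get(name); }
        void addFlushCallback(Runnable cb) { flushCallbacks.add(cb); }

        void defineMethod(String name, DynamicMethod m) {
            methods.put(name, m);
            serial++;                               // invalidates guard-based caches
            flushCallbacks.forEach(Runnable::run);  // flushes callback-based caches
        }
    }

    // Guard on every call: always correct, but the serial check sits on the hot path.
    class GuardedCallSite {
        private int cachedSerial = -1;
        private DynamicMethod cached;

        Object call(ClassSketch klass, String name, Object self, Object... args) {
            if (cached == null || cachedSerial != klass.serial()) {
                cached = klass.searchMethod(name);
                cachedSerial = klass.serial();
            }
            return cached.call(self, args);
        }
    }

    // Flushed from outside the call path: the hot path is a null check and the call.
    class CallbackCallSite {
        private volatile DynamicMethod cached;

        Object call(ClassSketch klass, String name, Object self, Object... args) {
            DynamicMethod m = cached;
            if (m == null) {
                m = klass.searchMethod(name);
                cached = m;
                // flush happens on redefinition; a real implementation would
                // register this callback once, not on every refill
                klass.addFlushCallback(() -> cached = null);
            }
            return m.call(self, args);
        }
    }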

I've attached a patch that is a sort of temporary compromise between the direct inline cache and using MethodCache. Basically it adds the MethodCache lookup logic directly into the CallAdapter, so the eventual DynamicMethod invocation still happens in the adapter (but with a bunch of extra logic wrapped around it). The numbers are pretty good, but still not as fast as a clean inline cache with a simple guard:

methodcache version:
~/NetBeansProjects/jruby $ jruby -J-server test/bench/bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
  2.229000   0.000000   2.229000 (  2.229000)
  2.328000   0.000000   2.328000 (  2.328000)
  1.468000   0.000000   1.468000 (  1.468000)
  1.463000   0.000000   1.463000 (  1.463000)
  1.472000   0.000000   1.472000 (  1.472000)
  1.471000   0.000000   1.471000 (  1.472000)
  1.460000   0.000000   1.460000 (  1.460000)
  1.468000   0.000000   1.468000 (  1.468000)
  1.473000   0.000000   1.473000 (  1.474000)
  1.469000   0.000000   1.469000 (  1.469000)

This patch also still doesn't do STI dispatch.
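
For illustration, the shape of the compromise is roughly the following (hypothetical names, not the actual patch): the adapter does a MethodCache-style keyed lookup on every call and still performs the DynamicMethod invocation itself, so the extra cost is the lookup and guard logic wrapped around the call.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of the compromise approach, not the attached patch itself.
    interface DynamicMethod { Object call(Object self, Object[] args); }

    interface ClassLike {
        int serial();                                // bumped whenever methods change
        DynamicMethod searchMethodSlow(String name); // full hierarchy walk
    }

    class CompromiseCallAdapter {
        // toy key; a real cache keys on the class object and its generation
        private static final Map<String, DynamicMethod> METHOD_CACHE = new HashMap<>();

        Object call(ClassLike klass, String name, Object self, Object... args) {
            String key = System.identityHashCode(klass) + ":" + klass.serial() + ":" + name;
            DynamicMethod method = METHOD_CACHE.get(key);
            if (method == null) {
                method = klass.searchMethodSlow(name);
                METHOD_CACHE.put(key, method);
            }
            // the eventual invocation still happens here in the adapter
            return method.call(self, args);
        }
    }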

I'm going to think on these things for a bit and try to reconcile them. Basically, these are the dispatch characteristics we need to find a way to combine so the call adapter is as tight as possible (rough sketches follow each list below):

Dispatch mechanisms:
+ STI dispatch (or otherwise direct-calling the target Java code)
+ MethodCache logic
+ inline caching of DynamicMethod objects
+ long, slow path of looking up method in hierarchy, caching it, and calling it
- polymorphism in caching (future)
- method_missing logic in correct place during call adaptation
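
A rough sketch of how those tiers might layer in a single adapter, fastest first (hypothetical, simplified names):

    // Hypothetical sketch of layering the dispatch mechanisms above in one adapter.
    interface DynamicMethod { Object call(Object self, Object[] args); }

    interface ClassLike {
        int stiIndex(String name);                   // >= 0 if an STI slot exists, else -1
        DynamicMethod stiTarget(int index);          // direct Java target
        DynamicMethod cacheLookup(String name);      // shared MethodCache
        void cacheStore(String name, DynamicMethod m);
        DynamicMethod searchMethodSlow(String name); // full hierarchy walk
    }

    class TieredCallAdapter {
        private DynamicMethod inlineCache;           // flushed externally on redefinition

        Object call(ClassLike klass, String name, Object self, Object... args) {
            // 1. STI / direct dispatch straight to the bound Java code
            int sti = klass.stiIndex(name);
            if (sti >= 0) return klass.stiTarget(sti).call(self, args);

            // 2. inline cache of the DynamicMethod at this call site
            DynamicMethod m = inlineCache;
            if (m == null) {
                m = klass.cacheLookup(name);             // 3. shared MethodCache
                if (m == null) {
                    m = klass.searchMethodSlow(name);    // 4. long, slow hierarchy walk
                    if (m != null) klass.cacheStore(name, m);
                }
                if (m != null) inlineCache = m;
            }
            if (m == null) {
                // 5. method_missing, handled here in the adapter (a real version
                //    would pass the requested name along as an argument)
                m = klass.searchMethodSlow("method_missing");
            }
            return m.call(self, args);
        }
    }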

Framing and scoping:
+ no framing or scoping for STI and other "fast" calls
+ frame only for non-fast Java-bound methods
+ frame and scope for all others
+ potentially no scope when AOT or JIT profiling can show that a scope is not needed, which means moving the framing/scoping logic higher up the call path so it can be profiled and optimized along with the actual call (rather than requiring deep call stacks to generify framing/scoping)
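
A rough sketch of deciding framing/scoping at the call site (hypothetical names; the needs-frame/needs-scope flags could equally come from AOT or JIT profiling):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical sketch: push a frame and/or scope only when the target needs it,
    // decided at the call site rather than deep inside the call chain.
    interface DynamicMethod {
        boolean needsFrame();
        boolean needsScope();
        Object call(Object self, Object[] args);
    }

    class FramingCallAdapter {
        private final Deque<Object> frames = new ArrayDeque<>();  // placeholder frames
        private final Deque<Object> scopes = new ArrayDeque<>();  // placeholder scopes

        Object call(DynamicMethod method, Object self, Object... args) {
            boolean frame = method.needsFrame();   // false for STI and other "fast" calls
            boolean scope = method.needsScope();
            if (frame) frames.push(new Object());
            if (scope) scopes.push(new Object());
            try {
                return method.call(self, args);
            } finally {
                if (scope) scopes.pop();
                if (frame) frames.pop();
            }
        }
    }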

Method signatures:
+ exact arity calls
+ calls with or without block depending on whether block is given
- exactly typed calls (future)
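
A rough sketch of arity- and block-specific call signatures (hypothetical; the generic Object[] path stays as the fallback, and specialized methods override the exact-arity overloads to avoid boxing):

    // Hypothetical sketch of specialized call signatures on a method object.
    interface Block { Object yield(Object arg); }

    abstract class SpecializedMethod {
        // generic fallback every method must implement
        abstract Object call(Object self, Object[] args, Block block);

        // exact-arity, no-block overloads; fast methods override these directly
        Object call(Object self)                       { return call(self, new Object[0], null); }
        Object call(Object self, Object a0)            { return call(self, new Object[]{a0}, null); }
        Object call(Object self, Object a0, Object a1) { return call(self, new Object[]{a0, a1}, null); }

        // block-given overload, used only when the call site actually has a block
        Object call(Object self, Object a0, Block b)   { return call(self, new Object[]{a0}, b); }
    }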

Binding mechanisms:
+ binding one Java method to one Ruby name (current, easiest, slowest)
+ binding arity-specific Java methods to one Ruby name
- binding arity and type-specific Java methods to one Ruby name
(the last two will give the largest gains if the entire call chain from call site to call target can propagate the arity and types)
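
A rough sketch of binding arity-specific Java methods under one Ruby name (hypothetical names):

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: several arity-specific Java targets bound to one Ruby
    // name, so the dispatcher can jump straight to the right overload by arity.
    interface Arity0 { Object call(Object self); }
    interface Arity1 { Object call(Object self, Object arg0); }

    class ArityBoundMethod {
        private final Arity0 zero;
        private final Arity1 one;

        ArityBoundMethod(Arity0 zero, Arity1 one) { this.zero = zero; this.one = one; }

        Object call(Object self, Object... args) {
            switch (args.length) {
                case 0:  return zero.call(self);
                case 1:  return one.call(self, args[0]);
                default: throw new IllegalArgumentException("wrong number of arguments");
            }
        }
    }

    class Bindings {
        private final Map<String, ArityBoundMethod> table = new HashMap<>();

        // one Ruby name, two arity-specific Java targets
        void bind(String rubyName, Arity0 zero, Arity1 one) {
            table.put(rubyName, new ArityBoundMethod(zero, one));
        }

        Object dispatch(String rubyName, Object self, Object... args) {
            return table.get(rubyName).call(self, args);
        }
    }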

Up to now we've been able to demonstrate, in either production or experimental form, all of the "+" items above, but only a few of them have been used in combination. With all of the above items in place together, working in concert, I think we will be close to our theoretical "fastest possible" performance for JRuby on the JVM. I don't know how fast that will be.

- Charlie
