Charles Oliver Nutter wrote:
So I'm trying to reconcile the situation. I certainly don't want to lose
MethodCache performance, but I need to have the callback for inline
caching to be efficient. The cache flush has to be triggered outside the
call path, rather than having an additional guard in the code.
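A minimal sketch of that idea, assuming a call site that registers itself with the class it cached against, so that redefining a method flushes the site from outside the call path. All names here (InlineCacheSketch, ClassSketch, CallSite) are hypothetical and are not JRuby's actual classes; thread safety and method_missing are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not JRuby's CallAdapter or RubyClass.
final class InlineCacheSketch {
    interface Method {
        Object call(Object self);
    }

    static final class ClassSketch {
        private final Map<String, Method> methods = new HashMap<>();
        private final Set<CallSite> dependents = new HashSet<>();

        // Defining a method flushes every call site that cached against
        // this class. The flush happens here, not on the call path.
        void define(String name, Method method) {
            methods.put(name, method);
            for (CallSite site : dependents) {
                site.flush();
            }
            dependents.clear();
        }

        // Slow-path lookup; remembers the site so it can be flushed later.
        Method lookup(String name, CallSite site) {
            dependents.add(site);
            return methods.get(name);
        }
    }

    static final class CallSite {
        private final String name;
        private ClassSketch cachedClass;
        private Method cachedMethod;
        int slowLookups; // counts cache misses, for demonstration

        CallSite(String name) {
            this.name = name;
        }

        void flush() {
            cachedClass = null;
            cachedMethod = null;
        }

        Object call(ClassSketch cls, Object self) {
            // The only guard on the fast path is a class identity check.
            if (cachedClass != cls) {
                Method found = cls.lookup(name, this);
                slowLookups++;
                if (found == null) {
                    throw new IllegalStateException("undefined method " + name);
                }
                cachedClass = cls;
                cachedMethod = found;
            }
            return cachedMethod.call(self);
        }
    }
}
```

The trade-off in this sketch is that each class has to track its dependent call sites, in exchange for a fast path that never checks a version or flush flag.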
I've attached a patch that is a sort of temporary compromise between the
direct inline cache and using MethodCache. Basically it just adds the
logic to look up a method in MethodCache directly into the CallAdapter,
allowing the eventual DynamicMethod invocation to still happen in the
adapter (but adding a bunch of logic around it). The numbers are pretty
good, but still not as fast as a clean inline cache with a simple guard:
methodcache version:
~/NetBeansProjects/jruby $ jruby -J-server test/bench/bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
2.229000 0.000000 2.229000 ( 2.229000)
2.328000 0.000000 2.328000 ( 2.328000)
1.468000 0.000000 1.468000 ( 1.468000)
1.463000 0.000000 1.463000 ( 1.463000)
1.472000 0.000000 1.472000 ( 1.472000)
1.471000 0.000000 1.471000 ( 1.472000)
1.460000 0.000000 1.460000 ( 1.460000)
1.468000 0.000000 1.468000 ( 1.468000)
1.473000 0.000000 1.473000 ( 1.474000)
1.469000 0.000000 1.469000 ( 1.469000)
This patch also still doesn't do STI dispatch.
I'm going to think on these things for a bit and try to reconcile them.
Basically, there's the following dispatch characteristics we need to
find a way to combine so the call adapter is as tight as possible:
Dispatch mechanisms:
+ STI dispatch (or otherwise direct-calling the target Java code)
+ MethodCache logic
+ inline caching of DynamicMethod objects
+ long, slow path of looking up method in hierarchy, caching it, and
calling it
- polymorphism in caching (future)
- method_missing logic in correct place during call adaptation
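As a rough sketch of how three of the dispatch mechanisms above could be layered in one call site: the inline cache is tried first, then a shared method cache, then the slow hierarchy walk. Every name below (LayeredDispatchSketch, GlobalCache, searchHierarchy) is hypothetical and stands in for the real MethodCache and DynamicMethod machinery. STI dispatch, cache invalidation, polymorphic caching and method_missing are not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not JRuby's actual dispatch code.
final class LayeredDispatchSketch {
    interface Method {
        Object call(Object self);
    }

    static final class ClassSketch {
        final ClassSketch superclass;
        final Map<String, Method> methods = new HashMap<>();

        ClassSketch(ClassSketch superclass) {
            this.superclass = superclass;
        }
    }

    // Stand-in for a global method cache keyed by (class, name).
    static final class GlobalCache {
        private final Map<ClassSketch, Map<String, Method>> entries = new HashMap<>();
        int hits; // counts successful lookups, for demonstration

        Method get(ClassSketch cls, String name) {
            Map<String, Method> perClass = entries.get(cls);
            Method method = perClass == null ? null : perClass.get(name);
            if (method != null) {
                hits++;
            }
            return method;
        }

        void put(ClassSketch cls, String name, Method method) {
            entries.computeIfAbsent(cls, k -> new HashMap<>()).put(name, method);
        }
    }

    // Long, slow path: walk the superclass chain until the name is found.
    static Method searchHierarchy(ClassSketch cls, String name) {
        for (ClassSketch c = cls; c != null; c = c.superclass) {
            Method method = c.methods.get(name);
            if (method != null) {
                return method;
            }
        }
        return null;
    }

    static final class CallSite {
        private final String name;
        private final GlobalCache cache;
        private ClassSketch cachedClass;
        private Method cachedMethod;
        int hierarchyWalks; // counts slow-path lookups, for demonstration

        CallSite(String name, GlobalCache cache) {
            this.name = name;
            this.cache = cache;
        }

        Object call(ClassSketch cls, Object self) {
            if (cachedClass == cls) {
                return cachedMethod.call(self); // tier 1: inline cache
            }
            Method method = cache.get(cls, name); // tier 2: shared cache
            if (method == null) {
                method = searchHierarchy(cls, name); // tier 3: slow path
                hierarchyWalks++;
                if (method == null) {
                    throw new IllegalStateException("undefined method " + name);
                }
                cache.put(cls, name, method);
            }
            cachedClass = cls;
            cachedMethod = method;
            return method.call(self);
        }
    }
}
```

In this arrangement a second call site for the same class and name skips the hierarchy walk because the first site already populated the shared cache.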
Framing and scoping:
+ no framing or scoping for STI and other "fast" calls
+ frame only for non-fast Java-bound methods
+ frame and scope for all others
+ potentially no scope when AOT or JIT profiling can show that scope is
not needed
- moving the framing/scoping logic higher up the call path so it can be
profiled and optimized along with the actual call (rather than requiring
deep call stacks to generify framing/scoping)
Method signatures:
+ exact arity calls
+ calls with or without block depending on whether block is given
- exactly typed calls (future)
Binding mechanisms:
+ binding one Java method to one Ruby name (current, easiest, slowest)
+ binding arity-specific Java methods to one Ruby name
- binding arity and type-specific Java methods to one Ruby name
(the last two will give the largest gains if the entire call chain from
call site to call target can propagate the arity and types)
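To illustrate what binding arity-specific Java methods to one Ruby name might look like, here is a small sketch. It assumes a base type with a generic Object[] entry point plus arity-specific overloads that box into an array by default; a concrete method overrides the arity it actually supports so that an exact-arity call never allocates an argument array. The names (ArityBindingSketch, Add) are hypothetical and not taken from JRuby.

```java
// Hypothetical illustration only; not JRuby's DynamicMethod hierarchy.
final class ArityBindingSketch {
    static abstract class Method {
        // Generic entry point: any number of arguments, boxed in an array.
        abstract Object call(Object self, Object[] args);

        // Arity-specific entry points. By default they fall back to the
        // generic path, paying for an array allocation.
        Object call(Object self) {
            return call(self, new Object[0]);
        }

        Object call(Object self, Object a) {
            return call(self, new Object[] { a });
        }

        Object call(Object self, Object a, Object b) {
            return call(self, new Object[] { a, b });
        }
    }

    // A one-argument method bound directly to the arity-1 entry point.
    static final class Add extends Method {
        int genericCalls; // counts trips through the Object[] path

        @Override
        Object call(Object self, Object[] args) {
            genericCalls++;
            if (args.length != 1) {
                throw new IllegalArgumentException(
                        "wrong number of arguments (" + args.length + " for 1)");
            }
            return call(self, args[0]);
        }

        @Override
        Object call(Object self, Object a) {
            // Exact-arity path: no argument array is created.
            return ((Integer) self) + ((Integer) a);
        }
    }
}
```

A call site that knows it is passing exactly one argument can invoke call(self, a) directly; only callers that cannot know the arity ahead of time go through the Object[] path.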
Up to now we've been able to demonstrate, in either production or
experimental form, all of the "+" items above, but only a few of them
have been used in combination. With all of the above items in place
and working in concert, I think we will be close to our theoretical
"fastest possible" performance for JRuby on the JVM. I don't know how
fast that will be.
- Charlie