Jochen Theodorou wrote:
> ok, let me try to explain what I think of... The current system in 
> Groovy works like this: you have a narrow API, with some core that 
> actually selects and executes the method call. Ng is more or less the 
> same, but with a wide API. What I plan for the future is to no longer 
> let the core execute the methods; instead they return handles to the 
> call site and the call site will call the method for us.

Yes, I recall the discussions when this was implemented on trunk. And 
from my own tests, it definitely improved performance, but I haven't 
done a wide range of testing (as I'm sure you have, e.g. Grails and 
otherwise). One concern that occurs to me is how this affects the 
locality of the call site. Whereas in JRuby the call site is never more 
than a field access away, in Groovy it's retrieved through the same long 
pipeline. So that pipeline has to be doing some amount of "getting in 
the way" even if the call site encapsulates and eliminates a certain 
portion of it. Or am I misunderstanding? This doesn't seem as much like 
a call site optimization as simply currying a portion of the lookup 
process into an object you then cache at the metaclass level for future 
calls (and remove if there are changes).

Perhaps a stack trace of a typical call through one of your "call sites" 
would help illustrate the effect better?
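
To make the locality point concrete, here is a minimal Java sketch of a call site that lives in a field and caches its own lookup result. It is not JRuby's or Groovy's actual code: MethodTable, InlineCacheSite, and the generation stamp are hypothetical names chosen for illustration, and a single-argument UnaryOperator stands in for a real method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a metaclass: a mutable method table with a
// generation stamp that changes whenever a method is (re)defined.
final class MethodTable {
    private final Map<String, UnaryOperator<Object>> methods = new HashMap<>();
    private int generation = 0;
    int lookups = 0; // counts slow-path lookups, for illustration only

    void define(String name, UnaryOperator<Object> body) {
        methods.put(name, body);
        generation++; // invalidates every cache that captured the old stamp
    }

    int generation() {
        return generation;
    }

    // The "core": selects a method and hands it back without executing it.
    UnaryOperator<Object> lookup(String name) {
        lookups++;
        UnaryOperator<Object> m = methods.get(name);
        if (m == null) {
            throw new IllegalStateException("no method: " + name);
        }
        return m;
    }
}

// Hypothetical call site object, meant to be held in a field of the
// compiled code so that reaching it costs one field access.
final class InlineCacheSite {
    private final String name;
    private MethodTable cachedTable;
    private int cachedGeneration;
    private UnaryOperator<Object> cachedTarget;

    InlineCacheSite(String name) {
        this.name = name;
    }

    Object call(MethodTable table, Object arg) {
        // Guard: same table, and no mutation since the target was cached.
        if (table != cachedTable || table.generation() != cachedGeneration) {
            cachedTarget = table.lookup(name); // slow path, only on a miss
            cachedTable = table;
            cachedGeneration = table.generation();
        }
        return cachedTarget.apply(arg); // the site, not the core, invokes
    }
}
```

In this shape a repeated call through the same site costs a field read, the guard, and the invoke; the lookup only runs again after define() bumps the generation. The open question above is how much of the pipeline still sits in front of the equivalent object when it is cached at the metaclass level instead.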

> This design is very much oriented at invokedynamic, but we came up with 
> this before invokedynamic. Of course MethodHandles, such as described by 
> John Rose will come in very handy here. Most of what can be done today 
> with monkey patching and categories fits well in this new way. I also 
> plan to restrict a MetaClass to be no longer replaceable, but mutating 
> it is allowed. The downside of this is that if you want, for example, 
> to write code that reacts to each method call, you have to put that in 
> a MetaMethod. But much of what is done today will work without change I 
> think.

That is a *big* change for the language, I think, but in my opinion a 
very good one (and of course we've talked about this in the past). I 
believe that Groovy's ability to not only replace methods (EMC) and 
install categories, but to also wholesale replace metaclasses with 
custom implementations, often implemented in Groovy themselves, is a 
major barrier to optimization. I don't see the value in categories 
myself, so I won't go there. But in my opinion EMC should be the only 
MC, enabled by default everywhere, with ruby-like hooks to augment its 
behavior and no option for replacement. Then you're in a far better 
position to install more optimistic optimizations.
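
As a rough illustration of "mutable but not replaceable, with hooks", here is a small Java sketch. All names (HookedMetaClass, ClassRecord, onMethodAdded) are hypothetical and are not Groovy's MetaClass API; the hook is only in the spirit of Ruby's method_added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical metaclass that can be mutated but never swapped out.
final class HookedMetaClass {
    private final Map<String, Function<Object[], Object>> methods = new HashMap<>();
    private final List<Consumer<String>> methodAddedHooks = new ArrayList<>();

    // Register a callback fired after every method definition.
    void onMethodAdded(Consumer<String> hook) {
        methodAddedHooks.add(hook);
    }

    // Mutation is allowed: add or overwrite a method, then notify the hooks.
    void addMethod(String name, Function<Object[], Object> body) {
        methods.put(name, body);
        for (Consumer<String> hook : methodAddedHooks) {
            hook.accept(name);
        }
    }

    Object invoke(String name, Object... args) {
        Function<Object[], Object> body = methods.get(name);
        if (body == null) {
            throw new IllegalStateException("no method: " + name);
        }
        return body.apply(args);
    }
}

// Hypothetical class record: the metaclass reference is final, so a cache
// may hold on to it and only has to watch for mutations.
final class ClassRecord {
    final HookedMetaClass metaClass = new HookedMetaClass();
}
```

Because the reference never changes, an optimizer could register its own hook to invalidate cached targets instead of re-checking which metaclass is installed on every call.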

> I think this approach will allow a narrow API, with the core selecting 
> the method, but not executing them. The actual call structure will be 
> shallow and caching can be done at lots of places
> 
> We plan on doing so too.. But only for a few cases that can be expected. 
> In fact in Groovy the user can give type information, so if he does we 
> can use that to predict methods and their result types. I plan such 
> actions also for calls to private methods. This way the bytecode won't 
> be that bloated

You'd be surprised. How big is a typical Groovy method in bytecode 
right now? I'd wager a substantial portion of that is call 
overhead...can you afford to double the size of some subset of operations?

Here's a simple JRuby fib method, minus about 15 bytecodes worth of 
preamble:

public org.jruby.runtime.builtin.IRubyObject method__0$RUBY$fib_ruby(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.builtin.IRubyObject[], org.jruby.runtime.Block);
   Code:
.... preamble ....
    45: aload_1
    46: iconst_3
    47: invokestatic    #40; //Method setPosition:(Lorg/jruby/runtime/ThreadContext;I)V
    50: aload_0
    51: getfield        #89; //Field site1:Lorg/jruby/runtime/CallSite;
    54: aload_1
    55: aload   11
    57: aload   6
    59: invokestatic    #95; //Method org/jruby/RubyFixnum.two:(Lorg/jruby/Ruby;)Lorg/jruby/RubyFixnum;
    62: invokevirtual   #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
    65: invokeinterface #101,  1; //InterfaceMethod org/jruby/runtime/builtin/IRubyObject.isTrue:()Z
    70: ifeq    83
    73: aload_1
    74: iconst_4
    75: invokestatic    #40; //Method setPosition:(Lorg/jruby/runtime/ThreadContext;I)V
    78: aload   11
    80: goto    145
    83: aload_1
    84: bipush  6
    86: invokestatic    #40; //Method setPosition:(Lorg/jruby/runtime/ThreadContext;I)V
    89: aload_0
    90: getfield        #106; //Field site2:Lorg/jruby/runtime/CallSite;
    93: aload_1
    94: aload_0
    95: getfield        #111; //Field site3:Lorg/jruby/runtime/CallSite;
    98: aload_1
    99: aload_2
    100: aload_0
    101: getfield        #116; //Field site4:Lorg/jruby/runtime/CallSite;
    104: aload_1
    105: aload   11
    107: aload   6
    109: invokestatic    #95; //Method org/jruby/RubyFixnum.two:(Lorg/jruby/Ruby;)Lorg/jruby/RubyFixnum;
    112: invokevirtual   #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
    115: invokevirtual   #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
    118: aload_0
    119: getfield        #119; //Field site5:Lorg/jruby/runtime/CallSite;
    122: aload_1
    123: aload_2
    124: aload_0
    125: getfield        #122; //Field site6:Lorg/jruby/runtime/CallSite;
    128: aload_1
    129: aload   11
    131: aload   6
    133: invokestatic    #125; //Method org/jruby/RubyFixnum.one:(Lorg/jruby/Ruby;)Lorg/jruby/RubyFixnum;
    136: invokevirtual   #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
    139: invokevirtual   #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
    142: invokevirtual   #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
    145: areturn

Now this bytecode is pretty tight. There are some special-case methods 
for Fixnum 1 and 2, CallSite objects to encapsulate some boilerplate 
call-wrapping logic, and "setPosition" calls to update the Ruby stack 
trace, but otherwise we've managed to boil it down a lot. And it's still 
a lot of code. I've been doing a bytecode audit recently to make sure 
all bytecode generated is as clean as possible, and this is the result 
at the moment (trunk code). What does a comparable fib method in Groovy 
look like with the new call site stuff?
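
For readers who would rather not trace the operand stack by hand, the listing reads back into roughly the following Java. This is a hand-written sketch with simplified stand-in types: long replaces IRubyObject, a hypothetical Site interface replaces org.jruby.runtime.CallSite, and the ThreadContext argument and setPosition calls are dropped. Which operation each site performs is inferred from the shape of the stack in the listing, not taken from the JRuby sources.

```java
// Hypothetical stand-in for org.jruby.runtime.CallSite, reduced to longs.
interface Site {
    long call(long receiver, long arg);
}

final class FibSketch {
    // One field per call site, mirroring site1..site6 in the listing.
    private final Site site1 = (a, b) -> a < b ? 1 : 0; // n < 2 (1 means true)
    private final Site site2 = (a, b) -> a + b;         // fib(..) + fib(..)
    private final Site site3 = (self, n) -> fib(n);     // fib(n - 2)
    private final Site site4 = (a, b) -> a - b;         // n - 2
    private final Site site5 = (self, n) -> fib(n);     // fib(n - 1)
    private final Site site6 = (a, b) -> a - b;         // n - 1

    long fib(long n) {
        // Offsets 50-70: site1.call(n, 2), then isTrue() and the branch.
        if (site1.call(n, 2) != 0) {
            return n; // offsets 78-80: push n and jump to areturn
        }
        // Offsets 89-142: the two recursive calls feed the final site2 call.
        // The 0 passed as receiver stands in for "self" (aload_2).
        return site2.call(
                site3.call(0, site4.call(n, 2)),
                site5.call(0, site6.call(n, 1)));
    }
}
```

Each arithmetic operator, the comparison, and both recursive calls get their own site field, which is why a two-line method needs six getfield/invokevirtual pairs.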

Of course I'm not saying to go for it...I'm going to try to do the same 
thing with profiling data gathered during interpretation, if I can find 
a way to shrink the bytecode duplication to a reasonable level. But I 
think tricks that depend on type annotations are really not in the 
spirit of the language...and if possible I would help you explore ways 
to optimize normal dynamic invocation more first, because I think 
that's where the most generally applicable gains are going to come from.
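
One possible shape for "profiling data gathered during interpretation" is a small per-call-site counter of receiver types, consulted later when deciding whether a specialized path is worth emitting. The sketch below is only an assumption about how that could look; SiteProfile, record, and dominantType are hypothetical names, not JRuby code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-call-site profile, filled in while the interpreter runs.
final class SiteProfile {
    private final Map<Class<?>, Integer> receiverCounts = new HashMap<>();
    private int total = 0;

    // Called by the interpreter each time the site is executed.
    void record(Object receiver) {
        receiverCounts.merge(receiver.getClass(), 1, Integer::sum);
        total++;
    }

    // Returns a receiver class that accounts for at least minShare of the
    // recorded calls, or null if none does. With minShare above 0.5 at
    // most one class can qualify.
    Class<?> dominantType(double minShare) {
        for (Map.Entry<Class<?>, Integer> e : receiverCounts.entrySet()) {
            if (total > 0 && e.getValue() >= minShare * total) {
                return e.getKey();
            }
        }
        return null;
    }
}
```

A compiler using such a profile would emit the duplicated, type-specific bytecode only at sites where dominantType returns something, which is one way to keep the duplication bounded without asking the user for type annotations.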

> To tell the truth, Groovy is fast enough for me, even if it is sometimes 
> 5-100 times slower than Java. It is quite easy to get the speed very 
> much up. But a language is not only about what the implementors want and 
> a community driven language like Groovy especially not. Groovy is no 
> academic language where you write papers when you have a good idea. 
> Instead a language is also much about politics, and if the public 
> demands more speed, then we will do our best. Also here are people 
> afraid of dynamic languages and we need to show them, that they don't 
> need to be slow, just because they are dynamic
...
> Well, in a benchmark like the Alioth Shootout you are not allowed to use 
> this obvious solution. That gives bad press. And since a language is so 
> much about politics, you have to handle bad press somehow

I hate having to worry about performance, but I love optimizing it. The 
world is far too performance obsessed, but there are reasons for it. I 
would strongly caution against optimizations designed to make specific 
benchmarks fast, even if the political gains would be substantial. Ruby 
1.9 added fast-path Fixnum math operators and ended up looking great on 
a lot of benchmarks. Then more and more complaints started to come in 
that they resulted in slowing down *everything non-Fixnum type* because 
of the extra typechecking involved.
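
The cost being described has a simple shape, sketched here in Java. This is not the Ruby 1.9 implementation: PlusSketch, plus, and dynamicPlus are hypothetical, Long stands in for Fixnum, dynamicPlus is a toy substitute for full dynamic dispatch, and overflow handling is ignored.

```java
import java.math.BigInteger;

final class PlusSketch {
    static int guardChecks = 0; // counts fast-path guards, for illustration only

    // Toy stand-in for full dynamic dispatch of "+".
    static Object dynamicPlus(Object a, Object b) {
        if (a instanceof String) {
            return (String) a + b;
        }
        if (a instanceof BigInteger && b instanceof BigInteger) {
            return ((BigInteger) a).add((BigInteger) b);
        }
        if (a instanceof Long && b instanceof Long) {
            return (Long) a + (Long) b;
        }
        throw new IllegalArgumentException("unsupported operands");
    }

    // "+" with a fast path in front of dispatch.
    static Object plus(Object a, Object b) {
        guardChecks++; // every caller pays for this guard
        if (a instanceof Long && b instanceof Long) {
            return (Long) a + (Long) b; // fast path: no dispatch at all
        }
        return dynamicPlus(a, b); // everything else: guard plus normal dispatch
    }
}
```

Long operands skip dispatch entirely, while a String or BigInteger receiver does the same dispatch work as before plus the failed guard, so a benchmark made mostly of integer math can improve while other code gets slightly slower.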

- Charlie
