Jochen Theodorou wrote:
> ok, let me try to explain what I think of... The current system in
> Groovy works like this: you have a narrow API, with some core that
> actually selects and executes the method call. Ng is more or less the
> same, but with a wide API. What I plan for the future is to no longer
> let the core execute the methods, instead they return handles to the
> call site and the call site will call the method for us.
Yes, I recall the discussions when this was implemented on trunk. And
from my own tests, it definitely improved performance, but I haven't
done a wide range of testing (as I'm sure you have, e.g. grails and
otherwise). One concern that occurs to me is how this affects the
locality of the call site. Where in JRuby, the call site is never more
than a field access away, in Groovy it's retrieved from the same long
pipeline. So that pipeline has to be doing some amount of "getting in
the way" even if the call site encapsulates and eliminates a certain
portion of it. Or am I misunderstanding? This doesn't seem as much like
a call site optimization as simply currying a portion of the lookup
process into an object you then cache at the metaclass level for future
calls (and removing it if there are changes).
Perhaps a stack trace of a typical call through one of your "call sites"
would help illustrate the effect better?
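
To make the locality point concrete, here is a minimal Java sketch of a call site that sits in a field of the calling code and only falls back to the metaclass when its cache is stale. It is neither JRuby nor Groovy code: `ToyMetaClass`, `ToyCallSite`, `CallSiteDemo`, and the version-counter invalidation scheme are all hypothetical, chosen only to illustrate the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a mutable metaclass: a versioned method table.
final class ToyMetaClass {
    private final Map<String, Function<Object, Object>> methods = new HashMap<>();
    private int version = 0;  // bumped on every mutation
    int lookups = 0;          // counts slow-path lookups, for illustration only

    void define(String name, Function<Object, Object> body) {
        methods.put(name, body);
        version++;            // invalidates every call site that cached this metaclass
    }

    int version() {
        return version;
    }

    Function<Object, Object> lookup(String name) {
        lookups++;
        Function<Object, Object> m = methods.get(name);
        if (m == null) {
            throw new IllegalStateException("no method " + name);
        }
        return m;
    }
}

// A monomorphic call site: caches one target for one metaclass version.
final class ToyCallSite {
    private final String name;
    private ToyMetaClass cachedMc;
    private int cachedVersion = -1;
    private Function<Object, Object> cached;

    ToyCallSite(String name) {
        this.name = name;
    }

    Object call(ToyMetaClass mc, Object self) {
        // Slow path only when the metaclass differs or has been mutated.
        if (cachedMc != mc || cachedVersion != mc.version()) {
            cached = mc.lookup(name);
            cachedMc = mc;
            cachedVersion = mc.version();
        }
        return cached.apply(self);
    }
}

final class CallSiteDemo {
    // The site is one field access away from the code that uses it.
    static final ToyCallSite SITE = new ToyCallSite("twice");

    public static void main(String[] args) {
        ToyMetaClass mc = new ToyMetaClass();
        mc.define("twice", x -> ((Integer) x) * 2);
        for (int i = 0; i < 3; i++) {
            System.out.println(SITE.call(mc, i));
        }
        System.out.println("slow-path lookups: " + mc.lookups);
    }
}
```

In this sketch the three calls trigger a single metaclass lookup. A design that fetches the site itself from the metaclass would repeat some lookup work on every call, which is the concern raised above.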
> This design is very much oriented at invokedynamic, but we came up with
> this before invokedynamic. Of course MethodHandles, such as described by
> John Rose will come in very handy here. Most of what can be done today
> with monkey patching and categories fits well in his new way. I plan
> also to restrict a MetaClass to be no longer replaceable, but mutating
> it is allowed. The downside of this is, that if you want for example
> write code that reacts to each method call, that you have to put that in
> a MetaMethod. But much of what is done today will work without change I
> think.
That is a *big* change for the language, I think, but in my opinion a
very good one (and of course we've talked about this in the past). I
believe that Groovy's ability to not only replace methods (EMC) and
install categories, but to also wholesale replace metaclasses with
custom implementations, often implemented in Groovy themselves, is a
major barrier to optimization. I don't see the value in categories
myself, so I won't go there. But in my opinion EMC should be the only
MC, enabled by default everywhere, with ruby-like hooks to augment its
behavior and no option for replacement. Then you're in a far better
position to install more optimistic optimizations.
> I think this approach will allow a narrow API, with the core selecting
> the method, but not executing them. The actual call structure will be
> shallow and caching can be done at lots of places
>
> We plan on doing so too.. But only for a few cases that can be expected.
> In fact in Groovy the user can give type information, so if he does we
> can use that to predict methods and their result types. I plan such
> actions also for calls to private methods. This way the bytecode won't
> be that bloated
You'd be surprised. How big is a typical Groovy method in bytecode
right now? I'd wager a substantial portion of that is call
overhead...can you afford to double the size of some subset of operations?
Here's a simple JRuby fib method, minus about 15 bytecodes worth of
preamble:
public org.jruby.runtime.builtin.IRubyObject method__0$RUBY$fib_ruby(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.builtin.IRubyObject[], org.jruby.runtime.Block);
  Code:
   .... preamble ....
   45: aload_1
   46: iconst_3
   47: invokestatic #40; //Method setPosition:(Lorg/jruby/runtime/ThreadContext;I)V
   50: aload_0
   51: getfield #89; //Field site1:Lorg/jruby/runtime/CallSite;
   54: aload_1
   55: aload 11
   57: aload 6
   59: invokestatic #95; //Method org/jruby/RubyFixnum.two:(Lorg/jruby/Ruby;)Lorg/jruby/RubyFixnum;
   62: invokevirtual #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
   65: invokeinterface #101, 1; //InterfaceMethod org/jruby/runtime/builtin/IRubyObject.isTrue:()Z
   70: ifeq 83
   73: aload_1
   74: iconst_4
   75: invokestatic #40; //Method setPosition:(Lorg/jruby/runtime/ThreadContext;I)V
   78: aload 11
   80: goto 145
   83: aload_1
   84: bipush 6
   86: invokestatic #40; //Method setPosition:(Lorg/jruby/runtime/ThreadContext;I)V
   89: aload_0
   90: getfield #106; //Field site2:Lorg/jruby/runtime/CallSite;
   93: aload_1
   94: aload_0
   95: getfield #111; //Field site3:Lorg/jruby/runtime/CallSite;
   98: aload_1
   99: aload_2
   100: aload_0
   101: getfield #116; //Field site4:Lorg/jruby/runtime/CallSite;
   104: aload_1
   105: aload 11
   107: aload 6
   109: invokestatic #95; //Method org/jruby/RubyFixnum.two:(Lorg/jruby/Ruby;)Lorg/jruby/RubyFixnum;
   112: invokevirtual #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
   115: invokevirtual #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
   118: aload_0
   119: getfield #119; //Field site5:Lorg/jruby/runtime/CallSite;
   122: aload_1
   123: aload_2
   124: aload_0
   125: getfield #122; //Field site6:Lorg/jruby/runtime/CallSite;
   128: aload_1
   129: aload 11
   131: aload 6
   133: invokestatic #125; //Method org/jruby/RubyFixnum.one:(Lorg/jruby/Ruby;)Lorg/jruby/RubyFixnum;
   136: invokevirtual #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
   139: invokevirtual #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
   142: invokevirtual #74; //Method org/jruby/runtime/CallSite.call:(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
   145: areturn
Now this bytecode is pretty tight. There are some special-case methods
for Fixnum 1 and 2, CallSite objects to encapsulate some boilerplate
call-wrapping logic, and "setPosition" calls to update the Ruby stack
trace, but otherwise we've managed to boil it down a lot. And it's still
a lot of code. I've been doing a bytecode audit recently to make sure
all bytecode generated is as clean as possible, and this is the result
at the moment (trunk code). What does a comparable fib method in Groovy
look like with the new call site stuff?
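
For readers who would rather not trace the stack by hand, the listing above has roughly the following shape when written as Java source. This is a reading of the bytecode, not JRuby's actual code: `ToySite` and `FibShape` are hypothetical, the `ThreadContext` argument and the `setPosition` calls are left out, and plain `Long` values stand in for Fixnums. Which operator each site implements is inferred from the call structure.

```java
// Hypothetical two-argument call site; the sites in the listing also take a context.
interface ToySite {
    Object call(Object receiver, Object arg);
}

final class FibShape {
    // Stand-ins for the special-cased Fixnum constants one and two.
    private static final Object ONE = 1L;
    private static final Object TWO = 2L;

    private static long num(Object o) {
        return (Long) o;
    }

    // One field per call in the method body, mirroring site1..site6.
    private final ToySite site1 = (n, two) -> num(n) < num(two);           // n < 2
    private final ToySite site2 = (a, b) -> num(a) + num(b);                // a + b
    private final ToySite site3 = (self, arg) -> ((FibShape) self).fib(arg);
    private final ToySite site4 = (n, two) -> num(n) - num(two);            // n - 2
    private final ToySite site5 = (self, arg) -> ((FibShape) self).fib(arg);
    private final ToySite site6 = (n, one) -> num(n) - num(one);            // n - 1

    Object fib(Object n) {
        // Offsets 50-70: site1.call(n, two), then the truthiness test.
        if ((Boolean) site1.call(n, TWO)) {
            return n;                                   // offsets 78-80
        }
        // Offsets 89-142: site2 receives the results of the two recursive calls.
        return site2.call(
            site3.call(this, site4.call(n, TWO)),
            site5.call(this, site6.call(n, ONE)));
    }

    public static void main(String[] args) {
        System.out.println(new FibShape().fib(10L));
    }
}
```

Even in this reduced form, every operator and every recursive call goes through its own site object, which is where most of the bytecode in the listing comes from.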
Of course I'm not saying to go for it...I'm going to try to do the same
thing with profiling data gathered during interpretation, if I can find
a reasonable way to shrink the bytecode duplication to a reasonable
level. But I think tricks that depend on type annotations are really not
in the spirit of the language...and if possible I would help you explore
ways to optimize normal dynamic invocation more first, because I think
that's where the most generally applicable gains are going to come from.
> to say the truth, Groovy is fast enough for me, even if it is sometimes
> 5-100 times slower than Java. It is quite easy to get the speed very
> much up. But a language is not only about what the implementors want and
> a community driven language like Groovy especially not. Groovy is no
> academic language where you write papers when you have a good idea.
> Instead a language is also much about politics, and if the public
> demands more speed, then we will do our best. Also here are people
> afraid of dynamic languages and we need to show them, that they don't
> need to be slow, just because they are dynamic
...
> Well, in a benchmark like the Alioth Shootout you are not allowed to use
> this obvious solution. That gives bad press. And since a language is so
> much about politics, you have to handle bad press somehow
I hate having to worry about performance, but I love optimizing it. The
world is far too performance obsessed, but there are reasons for it. I
would strongly caution against optimizations designed to make specific
benchmarks fast, even if the political gains would be substantial. Ruby
1.9 added fast-path Fixnum math operators and ended up looking great on
a lot of benchmarks. Then more and more complaints started to come in
that they ended up slowing down *every non-Fixnum type* because
of the extra typechecking involved.
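
The cost described here can be sketched in a few lines of Java. This is an illustration of the general fast-path pattern, not Ruby 1.9's implementation: `FastPathPlus`, `Addable`, `Text`, and the `guardMisses` counter are hypothetical names, and `Long` stands in for Fixnum.

```java
final class FastPathPlus {
    // Hypothetical generic protocol standing in for full dynamic dispatch.
    interface Addable {
        Object plus(Object other);
    }

    // Toy non-integer type that always ends up on the generic path.
    static final class Text implements Addable {
        final String value;

        Text(String value) {
            this.value = value;
        }

        @Override
        public Object plus(Object other) {
            return new Text(value + ((Text) other).value);
        }
    }

    static int guardMisses = 0;  // times the fast-path check ran and failed

    static Object plus(Object a, Object b) {
        // Fast path: both operands are boxed integers, so dispatch is skipped.
        if (a instanceof Long && b instanceof Long) {
            long x = (Long) a;
            long y = (Long) b;
            return x + y;
        }
        // Every other receiver has already paid for the check above for nothing.
        guardMisses++;
        return ((Addable) a).plus(b);
    }

    public static void main(String[] args) {
        System.out.println(plus(2L, 3L));
        System.out.println(((Text) plus(new Text("a"), new Text("b"))).value);
        System.out.println("guard misses: " + guardMisses);
    }
}
```

Integer arithmetic gets faster because it never reaches `Addable.plus`, but every other type now runs two failed `instanceof` checks before its normal dispatch.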
- Charlie