On Wed, Jul 22, 2009 at 12:11 AM, Subramanya Sastry <[email protected]> wrote:
>> So for example, if we know we're doing a call that is likely to access
>> the caller's frame, like "public" or "private", the IR could also
>> include information about preparing a frame or ensuring a frame has
>> already been prepared. This could allow us to lazily stand up those
>> structures only when needed, and potentially only stand up the parts
>> we really want (like preparing a frame-local "visibility" slot on some
>> threadlocal that could then be used by the subsequent calls).
>
> Makes sense. By frame, are you referring to the standard stack call frame,
> or is it some other heap structure specific to the implementation? I
> presume the latter.
We really have the call frame split in two right now. One half
contains slots for all the not-directly-accessible data like
visibility, caller's file and line number, and so on, and is contained
in org.jruby.runtime.Frame. The other half is for normal local
variables, and is contained in org.jruby.runtime.DynamicScope and its
subclasses, which are specialized for various scope sizes to avoid
array bounds checking as much as possible.
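In rough Java terms, the shape is something like this (a simplified
sketch; the real org.jruby.runtime classes carry more state than
shown, and field names here are approximate):

    // Simplified sketch of the two halves of the call frame.
    class Frame {
        Visibility visibility;   // caller's visibility (public/private/...)
        String fileName;         // caller's file
        int line;                // caller's line number
        IRubyObject self;        // ...plus block, jump target, etc.
    }

    class DynamicScope {
        // Size-specialized subclasses keep one, two, three... locals in
        // plain fields; the general case falls back to an array:
        IRubyObject[] variableValues;
    }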
Both are managed on artificial stacks on ThreadContext, which is
passed through almost all calls in the system. They could be further
divided and specialized, provided we don't introduce additional
artificial-stack overhead and we get a net gain for common cases, for
example with specialized logic that initializes only the visibility
slot rather than the entire frame (a hypothetical sketch follows).
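As a purely hypothetical sketch (these method names are invented, not
current JRuby API), a specialized push for calls that only need
visibility might look like:

    // Hypothetical: stand up only the visibility slot instead of
    // initializing the entire Frame. Names here are invented.
    static void pushVisibilityOnlyFrame(ThreadContext context, Visibility vis) {
        Frame frame = context.allocateFrame(); // hypothetical: grab an uninitialized pooled Frame
        frame.setVisibility(vis);              // initialize just the slot the call needs
    }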
> After whatever analyses we choose to perform on the current high level IR
> code, the high-level call instruction can be converted to a lower level IR
> where some of these details are made explicit. I need to better understand
> the current call protocol with all the boxing and wrapping that is involved
> to comment on this in greater detail. But yes, it should be possible to
> reduce some of these overheads. For example, you could have different
> flavors of call instructions depending on whether the call target is
> statically known or not, or whether an inline cache is needed. By
> making method lookups explicit, you can eliminate duplicate method table
> loads (assuming objects have pointers to their method tables).
I think that's all possible. There's a lot of hidden overhead
currently not represented in the AST or considered by the compiler:
repeated method lookups, repeated type checks (possibly on values
whose types have not changed), repeated loads of unmutated local
variables from a heap-based store, thread event pings, and so on. By
producing a low-level IR with all of those operations represented
explicitly, I'm sure we can eliminate a lot of them while
simultaneously making it a lot easier to compile.
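To make that concrete, here is roughly the work a single dynamic call
implies today, spelled out step by step in Java (API names are
approximate and imports from org.jruby.* are omitted; this is a
sketch, not actual compiler output). A low-level IR would represent
each step as an explicit, separately optimizable instruction:

    // The implicit work behind one dynamic call o.m1(arg), made explicit.
    static IRubyObject callM1(ThreadContext context, DynamicScope scope, IRubyObject arg) {
        context.pollThreadEvents();                  // thread event ping
        IRubyObject o = scope.getValue(0, 0);        // local load from heap-based scope
        RubyClass klass = o.getMetaClass();          // type check / method table load
        DynamicMethod m = klass.searchMethod("m1");  // method lookup
        return m.call(context, o, klass, "m1", arg); // the actual dispatch
    }

Once those steps are explicit, a pass can notice that a second call on
the same unchanged receiver can reuse klass and skip the redundant
type check and table load.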
This will also require more help from Tom and me to explain what's
actually happening and to work with you on producing an appropriate
low-level IR that accurately represents all this hidden overhead. It
shall be done!
>
> Consider this:
>
> o.m1(..)
> o.m2(..)
>
> Since the type of o hasn't changed between the two calls, you can skip the method
> table load for the second call. Anyway, I need to understand the call
> protocol in greater detail to comment more.
In rough pseudo-code, the basic inline-cached dyncall looks like this:

    get o.class.token             # current serial number of o's class
    load cached_token             # serial number saved by the last call here
    if equal, goto cached_call    # cache hit: skip the lookup entirely
    load o.class
    call searchMethod("m1")       # full method table lookup
    cache method                  # re-prime the cache for next time
    cache o.class.token
    cached_call: call method on o and arguments
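In Java terms, that logic looks roughly like this (a simplified
sketch; field and method names are approximate, and the real code also
handles arity variants, blocks, method_missing, and so on):

    // Simplified monomorphic inline cache for one call site.
    private DynamicMethod cachedMethod;
    private int cachedToken = -1;
    private final String methodName = "m1";

    public IRubyObject call(ThreadContext context, IRubyObject self, IRubyObject arg) {
        RubyClass klass = self.getMetaClass();
        if (klass.getCacheToken() == cachedToken) { // assumed accessor for the class serial
            // cache hit: no method table lookup at all
            return cachedMethod.call(context, self, klass, methodName, arg);
        }
        // cache miss: full lookup, then re-prime the cache
        DynamicMethod method = klass.searchMethod(methodName);
        cachedMethod = method;
        cachedToken = klass.getCacheToken();
        return method.call(context, self, klass, methodName, arg);
    }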
Most of this happens within InlineCachingCallSite, outside of the
actual bytecode we generate, but calling through this code defeats
many optimizations, including inlining. With invokedynamic, HotSpot
can inline through our logic, but of course we want a solution that
works without invokedynamic as well. There is a backport of
invokedynamic, but it basically just inlines all that logic directly
into the caller; in our case that would increase the size of the
generated code tremendously, so it may not be an option. We'll have to
explore various options :)
- Charlie