Thanks Charlie. I'll try to move more conversations here as well. Besides the interpreter and helping me understand JRuby and Ruby semantics, Tom has been doing a lot of the work to keep the whole setup organized and well-structured, and has done a lot of refactoring of my first-pass (and, in some cases, not-so-clean) code :) But, over the last 8 months especially, even when I have taken the lead, we have worked as a tag team to iron out semantics, fill in holes, and get to correctness in 1.8 and 1.9 modes.
As Charlie said, we are now in the JIT cycle to get the IR -> bytecode JIT in place. I should have a blog post soon (either on the EngineYard or the JRuby blog, depending on what works out best) that discusses the current state of this project.

Subbu.

On Wed, Apr 25, 2012 at 9:13 PM, Charles Oliver Nutter <head...@headius.com> wrote:

> Forwarding a conversation we've been having about IR2JVM work.
>
> The JRuby "IR" (Intermediate Representation) is our effort to build a completely new interpreter + compiler + runtime using a more optimizable structure. It currently can run RubySpecs green in 1.8 and 1.9 modes, which is a big milestone. I'm working on the new JIT that will go from IR to JVM bytecode, since the new JIT will be much simpler and we'll rely on IR manipulations to perform optimizations like constant propagation, dead code elimination, and so on. Subbu has been leading the effort, with Tom doing a lot of work on interpretation and me slowly getting into compilation. Subbu has a blog post about IR coming soon.
>
> Anyway, you can read this conversation and we'll try to have future conversations about IR on this list too.
>
> - Charlie
>
> Forwarded conversation
> Subject: A bit more IR2JVM work on the flight home
> ------------------------
>
> From: *Charles Oliver Nutter* <head...@headius.com>
> Date: Sun, Apr 22, 2012 at 8:58 PM
> To: Subramanya Sastry <sss.li...@gmail.com>, Thomas Enebo <tom.en...@gmail.com>
>
> https://gist.github.com/2466877
>
> Good old fib again. -X+CIR is faster than -X+C, but it's not doing all appropriate call site guarding. The main change I made was to get them binding "properly" with invokedynamic rather than binding to a slow lookup.
>
> So, +CIR is basically around as fast as +C at the moment (or a little faster, because it's not 100% correct yet).
>
> - Charlie
>
> ----------
> From: *Charles Oliver Nutter* <head...@headius.com>
> Date: Sun, Apr 22, 2012 at 9:12 PM
> To: Subramanya Sastry <sss.li...@gmail.com>, Thomas Enebo <tom.en...@gmail.com>
>
> With appropriate guards added, it's just about identical perf to the current compiler:
>
> system ~/projects/jruby $ time jruby -X+CIR -e "def fib(a); if a < 2; a; else; fib(a - 1) + fib(a - 2); end; end; p fib(35); p fib(35); p fib(35); p fib(35); p fib(35)"
> 9227465
> 9227465
> 9227465
> 9227465
> 9227465
>
> real 0m3.120s
> user 0m4.087s
> sys 0m0.113s
>
> - Charlie
>
> ----------
> From: *Charles Oliver Nutter* <head...@headius.com>
> Date: Sun, Apr 22, 2012 at 9:58 PM
> To: Subramanya Sastry <sss.li...@gmail.com>, Thomas Enebo <tom.en...@gmail.com>
>
> I started adding support for constants, but discovered that the IR has broken constant lookup into its component pieces. There's nothing wrong with this, per se, but for the JVM compiler I hid all that logic behind invokedynamic-based caching, and having it broken out in the IR means there are many instructions for what would be a single invokedynamic operation for me in the current compiler.
>
> I started implementing the pieces, and almost have it working, but I wanted to bring this up so I can learn about the gains from having the parts of constant lookup split out in the IR. What are they?
>
> - Charlie
>
> On Sun, Apr 22, 2012 at 11:12 PM, Charles Oliver Nutter
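For readers who haven't internalized Ruby's constant resolution rules: the "component pieces" Charlie is asking about map onto the three stages a bare constant reference goes through in Ruby -- lexical scope search, then inheritance (ancestor) search, then a const_missing call. A minimal illustrative sketch in plain Ruby (the class and constant names are invented for the example; these are not IR instructions):

    # Illustrative only: the three stages of resolving a bare constant reference.
    module Outer
      FOO = :lexical            # will be found by lexical scope search

      class Inner
        def lookup_foo
          FOO                   # 1. lexical: Inner, then Outer => :lexical
        end
      end
    end

    class Base
      BAR = :inherited          # will be found by inheritance (ancestor) search
    end

    class Child < Base
      def self.const_missing(name)
        :missing                # 3. last resort when (1) and (2) both fail
      end

      def lookup_bar
        BAR                     # 2. not in the lexical scopes, so ancestor search
      end                       #    (Child -> Base) finds it
      def lookup_baz
        BAZ                     # neither search finds BAZ, so const_missing runs
      end
    end

    p Outer::Inner.new.lookup_foo   # => :lexical
    p Child.new.lookup_bar          # => :inherited
    p Child.new.lookup_baz          # => :missing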
>
> ----------
> From: *Subramanya Sastry* <sss.li...@gmail.com>
> Date: Mon, Apr 23, 2012 at 12:42 AM
> To: Charles Oliver Nutter <head...@headius.com>
> Cc: Thomas Enebo <tom.en...@gmail.com>
>
> First of all, good to hear about the near-identical performance when IR is compiled down. So, at least for that microbenchmark, all info is getting through properly, which is good to know.
>
> As for constant lookup, the primary reason I split it up was for the JIT, because it actually hurts the interpreter, which wants fewer instructions. Well, to be honest, there were two other minor motivations which I'll get out of the way first since they are not as important. (a) To more accurately reflect constant lookup semantics for every constant reference -- having explicit lexical search, inheritance search, and const_missing calls exposes "primitive" operations via the IR rather than hiding them in the runtime. (b) Some const lookups (colon2, colon3) only do inheritance lookup, and it seems like less work to do (though mostly insignificant), and also it more accurately reflects what is going on, as in (a). This could be a good debugging benefit for a Ruby programmer, for ex. to know why their const lookup failed.
>
> Now, the other reason why I split it is that sometimes you could optimize individual pieces of constant lookup -- for example, determine that lexical lookup will never/always succeed and ignore one or the other piece and hardwire that result. Ex: class C; def self.foo; 5; end; end; p C.foo .. this lookup of C will always succeed lexically. You can imagine the other scenario. A lot of constant lookups probably fall in these 2 categories. So, this is an opportunity for optimization (which hasn't been implemented yet) once we verify that we have the semantics right.
>
> On the other hand, it is true that you can hide the whole thing behind an indy call site -- but it is possible to do the same with the individual pieces, just with probably more runtime code. With Dalvik, without indy support, the story is a little different, and I think there are benefits from individual optimizable lookup pieces.
>
> Anyway, at this time, I am not sure if this is helping more or getting in the way more -- something we can investigate/experiment with.
>
> Subbu.
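To make Subbu's two categories concrete, here is a small illustrative sketch in plain Ruby (again, invented names, not IR output):

    # Category 1: lexical lookup always succeeds.
    class C
      def self.foo; 5; end
    end
    p C.foo      # C resolves in the enclosing (top-level) lexical scope, so the
                 # inheritance-search and const_missing pieces could, in principle,
                 # be dropped and the result hardwired

    # Category 2: lexical lookup never succeeds.
    class Defaults
      TIMEOUT = 10
    end

    class Worker < Defaults
      def timeout
        TIMEOUT  # not defined in any lexically enclosing scope of this reference,
      end        # so only the inheritance-search piece (Worker -> Defaults) finds it
    end

    p Worker.new.timeout   # => 10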
>
> ----------
> From: *Charles Oliver Nutter* <head...@headius.com>
> Date: Mon, Apr 23, 2012 at 3:29 AM
> To: Subramanya Sastry <sss.li...@gmail.com>
>
> On Mon, Apr 23, 2012 at 12:42 AM, Subramanya Sastry <sss.li...@gmail.com> wrote:
> > First of all, good to hear about the near-identical performance when IR is compiled down. So, at least for that microbenchmark, all info is getting through properly, which is good to know.
> >
> > As for constant lookup, the primary reason I split it up was for the JIT, because it actually hurts the interpreter, which wants fewer instructions. Well, to be honest, there were two other minor motivations which I'll get out of the way first since they are not as important. (a) To more accurately reflect constant lookup semantics for every constant reference -- having explicit lexical search, inheritance search, and const_missing calls exposes "primitive" operations via the IR rather than hiding them in the runtime. (b) Some const lookups (colon2, colon3) only do inheritance lookup, and it seems like less work to do (though mostly insignificant), and also it more accurately reflects what is going on, as in (a). This could be a good debugging benefit for a Ruby programmer, for ex. to know why their const lookup failed.
>
> Having the steps of colon2 and colon3 split out into IR operations will indeed make them easier for me to optimize, since retrieving the LHS of the :: and then retrieving the RHS will both be seen as separate operations. The current compiler is not quite as clean when it comes to dealing with those two, and can only optimize "simple" const lookup with indy.
>
> But, while I appreciate that, and the fact that more logic is split out in the instructions for the JIT...it does lead to a lot more bytecode being generated. Example: https://gist.github.com/2468662
>
> Of course I don't *know* that it's too much code, but some methods have a lot of constants in them...it's hard to say.
>
> Anyway, I got hung up implementing constant lookup because of all the instructions; I set aside time this afternoon to implement a couple instrs, and it kept becoming more and more :) I'll see if I can't implement the full const lookup logic tomorrow.
>
> > Now, the other reason why I split it is that sometimes you could optimize individual pieces of constant lookup -- for example, determine that lexical lookup will never/always succeed and ignore one or the other piece and hardwire that result. Ex: class C; def self.foo; 5; end; end; p C.foo .. this lookup of C will always succeed lexically. You can imagine the other scenario. A lot of constant lookups probably fall in these 2 categories. So, this is an opportunity for optimization (which hasn't been implemented yet) once we verify that we have the semantics right.
>
> Yes, that's probably true, but the problem I always ran into trying to optimize one side or the other was mostly because lexical lookup comes first. Because of that, any of the lexically-enclosing modules could gain that constant at any time. This is ultimately why we only have a global invalidation for constants, and why I've never had much luck splitting the logic into pieces...they all have to check the same single token to cache anything, so I just stick them all behind the cache and validate only once that way.
>
> > On the other hand, it is true that you can hide the whole thing behind an indy call site -- but it is possible to do the same with the individual pieces, just with probably more runtime code. With Dalvik, without indy support, the story is a little different, and I think there are benefits from individual optimizable lookup pieces.
> >
> > Anyway, at this time, I am not sure if this is helping more or getting in the way more -- something we can investigate/experiment with.
>
> Sure, it's not insurmountable. The use of this special UNDEFINED value probably needs to change to RubyObject.UNDEF or similar, so I have something I can more easily look up in the compiler. And the constant lookup interpreter impls use some runtime checks against operations I'm going to have to think about for the JVM backend. But it's all doable, and most of it probably boils down to a single call in each case and some way of recognizing failed lookups.
>
> - Charlie
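The shadowing hazard Charlie describes is easy to reproduce in plain Ruby (illustrative only; none of this is JRuby's caching code, and the names are invented):

    class Defaults
      LIMIT = 10
    end

    module App
      class Job < Defaults
        def limit
          LIMIT   # today: lexical search (Job, App) finds nothing,
        end       # so inheritance search finds Defaults::LIMIT
      end
    end

    job = App::Job.new
    p job.limit        # => 10

    # Later, a lexically enclosing module gains the constant...
    App::LIMIT = 99

    # ...and because lexical search runs before inheritance search, the same
    # reference must now resolve differently. A cache that only remembered the
    # inheritance result would keep answering 10, which is why (per Charlie
    # above) every constant cache is validated against one global token.
    p job.limit        # => 99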
>
> ----------
> From: *Subramanya Sastry* <sss.li...@gmail.com>
> Date: Mon, Apr 23, 2012 at 4:23 AM
> To: Charles Oliver Nutter <head...@headius.com>
>
>> Having the steps of colon2 and colon3 split out into IR operations will indeed make them easier for me to optimize, since retrieving the LHS of the :: and then retrieving the RHS will both be seen as separate operations. The current compiler is not quite as clean when it comes to dealing with those two, and can only optimize "simple" const lookup with indy.
>>
>> But, while I appreciate that, and the fact that more logic is split out in the instructions for the JIT...it does lead to a lot more bytecode being generated. Example: https://gist.github.com/2468662
>>
>> Of course I don't *know* that it's too much code, but some methods have a lot of constants in them...it's hard to say.
>
> Let's see how this actually behaves when we get things running and what this does to performance. Not too hard to roll back, if necessary.
>
>> Anyway, I got hung up implementing constant lookup because of all the instructions; I set aside time this afternoon to implement a couple instrs, and it kept becoming more and more :) I'll see if I can't implement the full const lookup logic tomorrow.
>>
>> > Now, the other reason why I split it is that sometimes you could optimize individual pieces of constant lookup -- for example, determine that lexical lookup will never/always succeed and ignore one or the other piece and hardwire that result. Ex: class C; def self.foo; 5; end; end; p C.foo .. this lookup of C will always succeed lexically. You can imagine the other scenario. A lot of constant lookups probably fall in these 2 categories. So, this is an opportunity for optimization (which hasn't been implemented yet) once we verify that we have the semantics right.
>>
>> Yes, that's probably true, but the problem I always ran into trying to optimize one side or the other was mostly because lexical lookup comes first. Because of that, any of the lexically-enclosing modules could gain that constant at any time.
>
> Is this always true, i.e., does this dynamic addition of constants affect all lexical const lookups? I would like to understand this better tomorrow. Are you ever on IM? I have yahoo/gchat (subbu_ss/sss.lists). If you want to discuss on IRC, we can do that as well.
>
> Subbu.
>
> ----------
> From: *Subramanya Sastry* <sss.li...@gmail.com>
> Date: Thu, Apr 26, 2012 at 1:16 AM
> To: Charles Oliver Nutter <head...@headius.com>
> Cc: Thomas Enebo <tom.en...@gmail.com>
>
> I added back a SearchConstInstr which does all 3 things (lexical lookup, inheritance lookup, const_missing invocation). I didn't touch colon2 and colon3, however. They continue to use split instructions for InheritanceSearch and ConstMissing. You also wanted UndefinedValue to move to RubyObject. Yet to be done -- it is still used in the colon2 and colon3 cases.
>
> -S.
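For reference, colon2 and colon3 are the qualified forms of constant reference (Foo::Bar and ::Foo): once the left-hand side is known, only an inheritance search of that module -- plus a const_missing call on it -- is needed, with no lexical component, which is why those cases keep the split InheritanceSearch/ConstMissing instructions. A small illustrative Ruby sketch (invented names, not IR):

    module Net
      class Error < StandardError; end
    end

    # colon2: "Net::Error" -- resolve Net with an ordinary constant lookup, then
    # search only Net and its ancestors for Error; the referencing scope's
    # lexical chain is not consulted for the right-hand side.
    p Net::Error          # => Net::Error

    # colon3: "::Object" -- an explicitly top-level reference, searched only in
    # Object and its ancestors, again with no lexical component.
    p ::Object            # => Object

    # When the right-hand side is missing, const_missing is invoked on the
    # left-hand side module rather than on the referencing scope.
    def Net.const_missing(name)
      "Net has no #{name}"
    end
    p Net::Timeout        # => "Net has no Timeout"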