Re: [jvm-l] Battling the code size monster

Rémi Forax Sat, 11 Sep 2010 05:45:15 -0700

On 11/09/2010 04:11, Charles Oliver Nutter wrote:

On Fri, Sep 10, 2010 at 4:53 PM, Rémi Forax<[email protected]>  wrote:

Like Matt says, I do variable-uses analysis.

Our new compiler will be able to do that. It's too hard to do against
the AST (or hard enough that I don't want to try).


It's not hard!
Create a fresh scope. When you find a variable which
is not in that scope, look it up in the old scope and register that
this part of the script needs the variable.
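The fresh-scope lookup described above can be sketched roughly as follows. This is only an illustration, not code from either compiler: the `Stmt` shape (one assigned target plus the names it reads) and the method name `requiredFromOuter` are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class FreeVars {
    // Hypothetical statement shape: "target = f(uses...)".
    static final class Stmt {
        final String target;
        final List<String> uses;

        Stmt(String target, String... uses) {
            this.target = target;
            this.uses = Arrays.asList(uses);
        }
    }

    /** Returns the outer-scope variables the fragment reads before defining them itself. */
    static Set<String> requiredFromOuter(List<Stmt> fragment, Set<String> outerScope) {
        Set<String> fresh = new HashSet<>();           // the fresh scope for this fragment
        Set<String> required = new LinkedHashSet<>();  // variables registered as needed
        for (Stmt s : fragment) {
            for (String name : s.uses) {
                // Not in the fresh scope: look it up in the old scope and register it.
                // Names found in neither scope are ignored in this sketch.
                if (!fresh.contains(name) && outerScope.contains(name)) {
                    required.add(name);
                }
            }
            fresh.add(s.target);                       // defined locally from here on
        }
        return required;
    }
}
```

For a fragment `t = f(a, b); u = f(t, c); a = f(u)` with an outer scope of `a, b, c, d`, the registered inputs are `a`, `b` and `c`; `t` and `u` never leave the fragment and `d` is never touched.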

Of course banking on "the new compiler" is always a gamble :)

You can push state on the heap only between two blocks of N lines.
It's a kind of register spilling.

I don't understand. Can you elaborate?


Because you know which variables you need,
each block of code uses only local variables.
Between two blocks of code, you take the values of the local
variables you will need later and store them in
heap-allocated objects. The next block of code then loads
those values back from the heap-allocated objects.

block
  inputs:
  outputs:
--------------------------------
store all output variables on the heap
load all of the next block's input variables from the heap
--------------------------------
next block
  inputs:
  outputs:

Each block is a function that takes an array of heap-allocated
objects corresponding to all variables
and loads them into local variables.
The block only uses local variables, so you have
to inline closures here :)
At the end of the block, take all local variables
that will be needed in the next block and
store them into the heap-allocated objects.

In my implementation, the heap-allocated object
used to store a variable's value is also the object
used by the interpreter. So the transition from
the interpreter to the compiler uses the same
mechanism.
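A minimal sketch of the block scheme described above, assuming a hypothetical `Cell` class as the heap-allocated holder and a fixed slot layout (0 = a, 1 = b, 2 = sum, 3 = result). The names and layout are invented for illustration and are not taken from any real implementation.

```java
final class SplitBlocks {
    // Hypothetical heap-allocated holder for one variable's value.
    static final class Cell {
        Object value;
    }

    // Block 1. inputs: a, b. outputs: sum.
    static void block1(Cell[] vars) {
        int a = (Integer) vars[0].value;  // load inputs into locals
        int b = (Integer) vars[1].value;
        int sum = a + b;                  // the body touches locals only
        vars[2].value = sum;              // spill what the next block needs
    }

    // Block 2. inputs: sum. outputs: result.
    static void block2(Cell[] vars) {
        int sum = (Integer) vars[2].value;
        int result = sum * 2;
        vars[3].value = result;
    }

    static Object run(int a, int b) {
        Cell[] vars = new Cell[4];
        for (int i = 0; i < vars.length; i++) {
            vars[i] = new Cell();
        }
        vars[0].value = a;
        vars[1].value = b;
        block1(vars);
        block2(vars);
        return vars[3].value;
    }
}
```

An interpreter that kept its variables in the same `Cell[]` could presumably hand that array to `block1` directly, which is the shared mechanism mentioned above.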

I also transform loops into functions, i.e. I cut before the loop
and move the loop and the loop body into a new function.
Because I use loop counters, I'm able to know whether I need to aggregate
a loop and its inner loop or not.
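As a rough illustration of cutting before a loop, the sketch below puts the code before the loop in one method and the loop plus its body in another, passing variables through heap-allocated cells. The `Cell` class, the slot layout and the method names are assumptions for the example; the loop-counter heuristic for merging inner loops is not shown.

```java
final class LoopOutlining {
    // Hypothetical heap-allocated holder for one variable's value.
    static final class Cell {
        Object value;
    }

    // Code before the cut: defines n (slot 0) and acc (slot 1).
    static void prologue(Cell[] vars) {
        vars[0].value = 5;
        vars[1].value = 0;
    }

    // The loop and its body, moved into their own method.
    static void loop(Cell[] vars) {
        int n = (Integer) vars[0].value;
        int acc = (Integer) vars[1].value;
        for (int i = 1; i <= n; i++) {
            acc += i;
        }
        vars[1].value = acc;  // acc is still needed after the loop
    }

    static int run() {
        Cell[] vars = { new Cell(), new Cell() };
        prologue(vars);
        loop(vars);
        return (Integer) vars[1].value;
    }
}
```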

Yes, that's a split I've considered a few times, but the lack of
variable-use analysis makes it impossible to know if it's safe.


You don't have to prove it's safe, because at the end of the code
you move the values from the stack to the heap.
So it's always safe :)

The reported JRuby bugs have thankfully only been for two cases:

* A large flat method body, like the main body of a script
* Large hash or array initializations that are usually all literals

In the first case, splitting every N lines is easy. For the array and
hash initialization, we have a path that will scan the AST to ensure
all nodes are literals, and then break it up in some way. I haven't
taken it farther because nobody's reported real-world cases that
break...but I know they're out there. And of course, because JRuby is
quite happy to bail out on a compilation and keep interpreting, users
may or may not know that their code has been abandoned because of
size. The dynopt work will only aggravate this, so I may do a two-pass
compile: try to dynopt first, which is much faster but produces larger
code, and if that fails fall back on the old inline-cached version
that's lighter. It will still be faster than interpreting, and the
compilation cost is not particularly high for the AST-based compiler
(the IR-based compiler, with its multiple passes, is another
matter...).
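For the literal-only array and hash case mentioned above, the scan-then-split step might look something like this sketch. The `Node`, `Literal` and `Call` types and the chunk size are hypothetical stand-ins, not JRuby's actual AST classes; each returned chunk stands for the elements one generated helper method would fill in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class LiteralChunker {
    interface Node {}

    static final class Literal implements Node {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class Call implements Node {
        final String name;
        Call(String name) { this.name = name; }
    }

    // Scan the elements to ensure every node is a literal.
    static boolean allLiterals(List<Node> elements) {
        for (Node n : elements) {
            if (!(n instanceof Literal)) {
                return false;
            }
        }
        return true;
    }

    /** Splits an all-literal initializer into chunks; empty result if any element is not a literal. */
    static Optional<List<List<Node>>> chunks(List<Node> elements, int chunkSize) {
        if (!allLiterals(elements)) {
            return Optional.empty();  // not safe to split blindly; leave it to the normal path
        }
        List<List<Node>> out = new ArrayList<>();
        for (int i = 0; i < elements.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, elements.size());
            out.add(new ArrayList<>(elements.subList(i, end)));
        }
        return Optional.of(out);
    }
}
```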


As you already know, I don't think that an IR-based compiler
is a good idea. javac is not an IR-based compiler and
Java is fast.

I suppose this is another technique both we and Jython use: execute
another way. JRuby has always had an interpreter, and I decided early
on when I joined the project that the interpreter should stay. I think
that's ultimately been a very good move, since it has both enabled
dynamic optimization (even without indy and method handles, which make
such optimization easier against compiled code) and given us a
fallback path when compilation fails for any reason (interpretation is
not fast...but it does execute, which is more than you can say for
compiled-only languages that hit bytecode or JIT limits and either
don't run or run incredibly slow).
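The fallback order discussed in this message (try dynopt, then the lighter inline-cached compile, then keep interpreting) can be sketched as below. Everything here is a toy: the `TooLarge` exception, the size budget and the expansion factors are made-up assumptions, and real code size would be measured on the generated bytecode rather than on source length.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class CompileFallback {
    // Hypothetical signal that a strategy produced a method body over the budget.
    static final class TooLarge extends RuntimeException {
        TooLarge(String message) { super(message); }
    }

    static final int BUDGET = 64;  // toy size budget, not the real JVM limit

    // Toy strategies: "size" is source length times a made-up expansion factor.
    static String dynopt(String src) {
        if (src.length() * 8 > BUDGET) throw new TooLarge("dynopt");
        return "dynopt";
    }

    static String inlineCached(String src) {
        if (src.length() * 2 > BUDGET) throw new TooLarge("inline-cached");
        return "inline-cached";
    }

    /** Tries each strategy in order and falls back to interpretation if all of them fail. */
    static String choose(String src, List<Function<String, String>> strategies) {
        for (Function<String, String> strategy : strategies) {
            try {
                return strategy.apply(src);
            } catch (TooLarge e) {
                // This strategy generated too much code; try the next, lighter one.
            }
        }
        return "interpret";
    }

    static String compile(String src) {
        List<Function<String, String>> order =
            Arrays.asList(CompileFallback::dynopt, CompileFallback::inlineCached);
        return choose(src, order);
    }
}
```

With these toy numbers, a 4-character source picks `dynopt`, a 20-character source falls back to `inline-cached`, and a 40-character source ends up interpreted.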

- Charlie


Rémi
