Re: [jvm-l] Battling the code size monster

Jim Baker Fri, 10 Sep 2010 19:24:33 -0700

On Fri, Sep 10, 2010 at 7:58 PM, Charles Oliver Nutter
<[email protected]> wrote:


> On Fri, Sep 10, 2010 at 8:57 PM, Jim Baker <[email protected]> wrote:
> > We also thought about doing some sort of code splitting for Jython, but
> > punted since the code analysis seems unnecessarily difficult except in the
> > simplest case.
>
> Yeah, it's hard to do reliably, unless you have built the compiler to
> support it right away (as in Remi's case). We also have a new
> intermediate-representation compiler we're working on (or rather, a
> contributor has been graciously developing for us) that will make
> splitting easier, but it's not integrated into execution yet.
>

We probably want to look at that compiler for Jython 2.6, but more to
support gradual typing and no-frame support. This use case of large
functions (including top level module scripts) just comes up too rarely in
actual Python code. This may be because, as I understand it, generating
Python code is much rarer than is perhaps the case in Ruby.


> > We instead implemented a Python bytecode (PBC) VM to run on
> > top of the JVM, since PBC is reasonably well defined. In particular,
> > bytecode offsets are signed 32 bit, so this supports sufficiently large
> > methods ;). Our experimental functionality is just barely documented in
> > http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html#sys-meta-path,
> > and currently relies on using CPython to actually compile to PBC files
> > (.pyc). But the PBC VM works well, without too much overhead given the VM on
> > VM aspect. PBC is also more compact than the equivalent Java bytecode.
> > At some future point we will add support to the Jython compiler to emit PBC,
> > because it can support not only long methods, but also Android and applets
> > without ahead-of-time compilation requirements. We may also revisit the use
> > of a large switch statement and choose to use a polymorphic instruction
> > approach.
>
> As you probably can guess, I would *strongly* discourage the big
> switch. It simply doesn't work, at least on Hotspot, and we've now
> replaced all hot "big switch" code in JRuby with polymorphic
> dispatches. The poly dispatch ends up being faster than the switch in
> pretty much every case...sometimes to egregious proportions.
>
Oh sure... the VM is in PyBytecode.java, it's only about 1600 lines or so,
easy enough to convert. So when it graduates from experimental, we will
definitely do that. We have a couple of other mini VMs we will convert too -
our regex and pickling (Python-specific serialization) VMs. I'm actually
much more interested in these from a performance perspective.
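
To make the two dispatch styles concrete, here is a minimal sketch of a toy
stack machine written both ways. Everything in it is hypothetical: the
opcodes (PUSH, ADD, MUL), the Instr interface and the class names are made
up for illustration and are not taken from PyBytecode.java or from JRuby.
It only shows the shape of the conversion, not its performance.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DispatchSketch {
    // Hypothetical opcodes for a toy stack machine (not real Python bytecode).
    static final int PUSH = 0, ADD = 1, MUL = 2;

    // Style 1: one big switch over integer opcodes.
    static int runSwitch(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.length) {
            switch (code[pc++]) {
                case PUSH: stack.push(code[pc++]); break; // operand follows opcode
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                default:   throw new IllegalStateException("bad opcode");
            }
        }
        return stack.pop();
    }

    // Style 2: each instruction is an object; dispatch is a virtual call.
    interface Instr {
        void exec(Deque<Integer> stack);
    }

    static final class Push implements Instr {
        private final int value;
        Push(int value) { this.value = value; }
        public void exec(Deque<Integer> stack) { stack.push(value); }
    }

    static final class Add implements Instr {
        public void exec(Deque<Integer> stack) { stack.push(stack.pop() + stack.pop()); }
    }

    static final class Mul implements Instr {
        public void exec(Deque<Integer> stack) { stack.push(stack.pop() * stack.pop()); }
    }

    static int runPoly(Instr[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Instr instr : code) {
            instr.exec(stack); // the call site replaces the switch
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Both programs compute (2 + 3) * 4.
        int[] raw = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};
        Instr[] objs = {new Push(2), new Push(3), new Add(), new Push(4), new Mul()};
        System.out.println(runSwitch(raw));
        System.out.println(runPoly(objs));
    }
}
```

In the second style the decoding cost is paid once, when the Instr array is
built, and the interpreter loop shrinks to a single interface call per
instruction.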


> For John Rose and others who might ask about this: I think it's both
> the lack of profiling for switch paths and the fact that no one
> profile can really reflect the behavior of, for example, a bytecode VM
> which may have roughly equivalent frequency of each case. But I have
> not actually looked into the graph or assembly that results from
> compiling big hot switches (almost sounds inappropriate for mixed
> company).
>

One thing that comes to mind - I wonder how the polymorphic instruction
approach compares to the PyPy JIT, which basically generates JITs
optimized for executing bytecode VMs. That would be a truly interesting
comparison.

- Jim

