[jvm-l] class file size limits

Per Bothner Mon, 06 Jul 2009 11:55:32 -0700

When I explained the JVM's method size limit to somebody
trying to run a large compiler-generated Scheme file
on Kawa, he commented:


 > I thought Java was a modern language: but modern languages
 > are not supposed to have arbitrary and pointless restrictions. :-(

Which of course is a valid point.  It's really embarrassing
in this day and age that we're still using such a lame classfile
format.

Binary compatibility should not be an issue: a class file generated
by javac targeting Java 6 is not going to run on Java 5, and
one targeting Java 7 is not going to run on Java 6.
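
To spell out why the version number already acts as a gate, here is
a minimal sketch in Java that reads major_version from a class file
header and compares it with the highest version a VM accepts.  The
class and method names (ClassFileVersion, majorVersion, loadableBy)
are made up for illustration; the major versions 49, 50, and 51 for
Java 5, 6, and 7 are the standard ones.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ClassFileVersion {
    // Standard major_version values for the releases mentioned above.
    static final int JAVA_5 = 49;
    static final int JAVA_6 = 50;
    static final int JAVA_7 = 51;

    /** Returns major_version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] classFile) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();          // minor_version, ignored here
            return in.readUnsignedShort();   // major_version
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical helper: would a VM accepting up to vmMaxMajor load this file? */
    static boolean loadableBy(int vmMaxMajor, byte[] classFile) {
        return majorVersion(classFile) <= vmMaxMajor;
    }
}
```

A new format would simply ship with a new major_version, so older
VMs would reject it the same way they reject any newer class file.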

There are ways to split up a large method into multiple
methods or multiple class files, but that is the wrong
approach: it asks every language implementation to implement a
non-trivial *de*-optimization to work around a broken class file
format.

It's not clear what the best way to fix the format is.
A simple-minded fix would replace all the u2 types by u4 types.
This would increase the class size by a bit, though perhaps
not all that much after compression.  Some adaptive mechanism
would be better, so that compilers generate "large-model
class files" only when necessary.

Now if we are re-doing the class file format, one would want to make
other fixes as well - for example, allowing multiple classes in the
same class file (to reduce constant pool duplication) or dropping
old-style attributes that are subsumed by new attribute types.

However, let's not make the perfect be the enemy of the good
- at the very least we really should fix the size limitations,
because that is a hard limit.  Having all these 16-bit limits
when the world is moving to 64-bit is ridiculous.

A strawman proposal - though I'm guessing other and
better-thought-out designs exist out there:

If the u2 constant_pool_count is zero, then the real
constant_pool_count follows as a u4.  In that case, the
this_class, super_class, fields_count, methods_count, and
attributes_count are also u4.  Likewise indexes and lengths
in field and method descriptors.
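
As a rough sketch of how a reader might handle the zero escape, here
is some Java.  It relies on the fact that constant_pool_count is
always at least 1 in today's format, so zero is free to use as a
marker.  The names PoolCountReader, readConstantPoolCount, and
countFromBytes are hypothetical, and treating the u4 as unsigned is
my assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PoolCountReader {
    /**
     * Reads constant_pool_count under the strawman: a non-zero u2 is the
     * count itself; a u2 of zero means the real count follows as a u4.
     */
    static long readConstantPoolCount(DataInputStream in) throws IOException {
        int shortCount = in.readUnsignedShort();
        if (shortCount != 0) {
            return shortCount;               // ordinary short-mode class file
        }
        return in.readInt() & 0xFFFFFFFFL;   // long mode; u4 read as unsigned (assumption)
    }

    /** Convenience wrapper for trying the reader on a raw byte array. */
    static long countFromBytes(byte[] bytes) {
        try {
            return readConstantPoolCount(
                new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A reader would then remember whether it saw the escape and use that
to decide the width of this_class, super_class, and the counts.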

Each constant-pool entry has the same 16-bit format as now,
unless the tag has the high-order bit set, in which case
all the u2 indexes are u4.
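
The tag test could look something like the following Java sketch.
All existing constant-pool tags are small values with the high bit
clear, so the bit is available.  PoolTag and its methods are
hypothetical names; CONSTANT_Class (tag 7) is used only as an
example of an entry that holds a single index.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class PoolTag {
    static final int LONG_MODE_BIT = 0x80;   // high-order bit of the u1 tag
    static final int CONSTANT_CLASS = 7;     // existing tag, used as an example

    /** True if the entry uses u4 indexes under the strawman. */
    static boolean isLongMode(int tag) {
        return (tag & LONG_MODE_BIT) != 0;
    }

    /** The ordinary tag value with the long-mode bit stripped. */
    static int baseTag(int tag) {
        return tag & 0x7F;
    }

    /** Width in bytes of each index field in the entry. */
    static int indexWidth(int tag) {
        return isLongMode(tag) ? 4 : 2;
    }

    /** Reads one index of the width selected by the tag. */
    static long readIndex(DataInputStream in, int tag) throws IOException {
        return isLongMode(tag)
            ? in.readInt() & 0xFFFFFFFFL     // u4, read as unsigned (assumption)
            : in.readUnsignedShort();        // u2, as in today's format
    }
}
```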

(Notice it's helpful to allow a "long-mode classfile" to
contain a mix of short-mode and long-mode entries for
constants, fields, and methods, so a compiler can start by
generating short-mode entries, and only emit long-mode entries
when needed.)

For the various attributes, they can be in either short
mode or long mode - and a class file can have a mix.
One way we could indicate a long-mode attribute is that the
attribute_length has the high-order bit set.
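
A minimal Java sketch of that check follows.  attribute_length is
already a u4, so the high-order bit shows up as the sign bit of a
Java int.  I am assuming here that the remaining 31 bits still give
the length; the proposal above does not pin that down, and the names
AttributeLength, isLongMode, and payloadLength are made up.

```java
final class AttributeLength {
    /** True if the raw u4 attribute_length has its high-order bit set. */
    static boolean isLongMode(int rawLength) {
        return rawLength < 0;                // the high bit of a u4 is the int sign bit
    }

    /** Length in bytes, taken from the low 31 bits (an assumption). */
    static int payloadLength(int rawLength) {
        return rawLength & 0x7FFFFFFF;
    }
}
```

For example, a raw value of 0x80000010 would mean a long-mode
attribute with 16 bytes of content, while a raw value of 16 would be
a short-mode attribute of the same size.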

More tricky is the actual Code attribute.  It's preferable
for it to be "adaptive": Rather than a global flag that switches
between 16-bit and 32-bit offsets, perhaps it would be
better if an extension of the wide opcode could be used
to indicate 32-bit offsets, since that makes it possible to
generate code on-the-fly without having to know in advance if we
need a long-mode method.
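
To illustrate, here is a Java sketch of an emitter that picks the
encoding per instruction.  The opcode values for wide (0xC4) and
ifeq (0x99) are the existing ones, but "wide ifeq" followed by a
32-bit offset is purely hypothetical: today's wide prefix does not
apply to branches.  BranchEmitter and encodeIfeq are made-up names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class BranchEmitter {
    static final int WIDE = 0xC4;   // existing wide prefix opcode
    static final int IFEQ = 0x99;   // existing ifeq opcode

    /**
     * Encodes ifeq with the given branch offset.  Offsets that fit in a
     * signed 16-bit value use today's 3-byte form; larger ones use the
     * hypothetical 6-byte "wide ifeq" form with a 32-bit offset.
     */
    static byte[] encodeIfeq(int offset) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        try {
            if (offset >= Short.MIN_VALUE && offset <= Short.MAX_VALUE) {
                out.writeByte(IFEQ);
                out.writeShort(offset);      // 16-bit offset, as now
            } else {
                out.writeByte(WIDE);         // hypothetical use of the prefix
                out.writeByte(IFEQ);
                out.writeInt(offset);        // 32-bit offset
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

The point is that the choice is local to each branch, so short
methods keep exactly the bytes they have today.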
-- 
        --Per Bothner
[email protected]   http://per.bothner.com/
