When I explained the JVM's method size limit to somebody
trying to run a large compiler-generated Scheme file
on Kawa, he commented:
> I thought Java was a modern language: but modern languages
> are not supposed to have arbitrary and pointless restrictions. :-(
Which of course is a valid point. It's really embarrassing
that in this day and age we're still using such a lame classfile
format.
Binary compatibility should not be an issue: a class file generated
by javac targeting Java 6 is not going to run on Java 5, and
one targeting Java 7 is not going to run on Java 6.
There are ways to split up a large method into multiple
methods or multiple class files, but asking every language
implementation to implement a non-trivial *de*-optimization
to work around a broken class file format is really the
wrong approach.
Of course it's not clear what the best way to fix the format is.
A simple-minded fix would replace all the u2 types with u4 types.
This would increase the class size by a bit, though perhaps
not all that much after compression. Some adaptive mechanism
would be better, so that compilers generate "large-model
class files" only when necessary.
Now if we're re-doing the class file format, one would want to
make other fixes too - for example, allowing multiple classes in
the same class file (to reduce constant pool duplication), or
dropping old-style attributes that are subsumed by new attribute types.
However, let's not make the perfect be the enemy of the good
- at the very least we really should fix the size limitations,
because that is a hard limit. Having all these 16-bit limits
when the world is moving to 64-bit is ridiculous.
A strawman proposal - though I'm guessing other and
better-thought-out designs exist out there:
If the u2 constant_pool_count is zero, then the real
constant_pool_count follows as a u4. In that case, the
this_class, super_class, fields_count, methods_count, and
attributes_count are also u4. Likewise indexes and lengths
in field and method descriptors.
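
A minimal sketch of how a reader might handle the escape rule above.
It relies on the fact that constant_pool_count in today's format is
the number of entries plus one, so zero never appears and is free to
act as the escape. The class and method names are made up for
illustration; nothing here is an existing API.

```java
import java.nio.ByteBuffer;

// Hypothetical reader for the strawman header; not part of any real library.
public class PoolCountReader {
    final boolean longMode; // true if the u2 count was the zero escape
    final long count;       // the real constant_pool_count

    PoolCountReader(boolean longMode, long count) {
        this.longMode = longMode;
        this.count = count;
    }

    // ByteBuffer is big-endian by default, which matches the class file format.
    static PoolCountReader read(ByteBuffer in) {
        int u2 = in.getShort() & 0xFFFF;
        if (u2 != 0) {
            return new PoolCountReader(false, u2); // ordinary short-mode file
        }
        long u4 = in.getInt() & 0xFFFFFFFFL;       // escape: real count follows as u4
        return new PoolCountReader(true, u4);
    }

    public static void main(String[] args) {
        ByteBuffer shortForm = ByteBuffer.wrap(new byte[] {0x00, 0x2A});
        ByteBuffer longForm = ByteBuffer.wrap(
            new byte[] {0x00, 0x00, 0x00, 0x01, (byte) 0x86, (byte) 0xA0});
        System.out.println(read(shortForm).count); // 42
        System.out.println(read(longForm).count);  // 100000
    }
}
```

The longMode flag would then decide whether this_class, super_class
and the various counts are read as u2 or u4.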
Each constant-pool entry has the same 16-bit format as now,
unless the tag has the high-order bit set, in which case
all the u2 indexes are u4.
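
As a rough illustration of the per-entry flag, here is how a reader
might decode a CONSTANT_Class entry (tag 7, followed by a name
index). All existing tags are small values, so the high-order bit of
the tag byte is unused today. The LONG_FLAG constant and the class
name are assumptions for this sketch only.

```java
import java.nio.ByteBuffer;

// Hypothetical decoder for one constant-pool entry under the strawman.
public class PoolEntryReader {
    static final int LONG_FLAG = 0x80;    // assumed: high-order bit of the tag byte
    static final int CONSTANT_CLASS = 7;  // existing tag value for CONSTANT_Class

    // Returns the name_index of a CONSTANT_Class entry in either mode.
    static long readClassNameIndex(ByteBuffer in) {
        int tag = in.get() & 0xFF;
        boolean longMode = (tag & LONG_FLAG) != 0;
        int kind = tag & ~LONG_FLAG;
        if (kind != CONSTANT_CLASS) {
            throw new IllegalArgumentException("not a CONSTANT_Class entry: " + kind);
        }
        return longMode
            ? (in.getInt() & 0xFFFFFFFFL)  // long-mode entry: u4 index
            : (in.getShort() & 0xFFFF);    // short-mode entry: u2 index, as now
    }

    public static void main(String[] args) {
        ByteBuffer shortEntry = ByteBuffer.wrap(new byte[] {0x07, 0x00, 0x05});
        ByteBuffer longEntry = ByteBuffer.wrap(
            new byte[] {(byte) 0x87, 0x00, 0x01, 0x00, 0x00});
        System.out.println(readClassNameIndex(shortEntry)); // 5
        System.out.println(readClassNameIndex(longEntry));  // 65536
    }
}
```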
(Notice it's helpful to allow a "long-mode classfile" to
contain a mix of short-mode and long-mode entries for
constants, fields, and methods, so a compiler can start by
generating short-mode entries, and only emit long-mode entries
when needed.)
For the various attributes, they can be in either short
mode or long mode - and a class file can have a mix.
One way we could indicate a long-mode attribute is that the
attribute_length has the high-order bit set.
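
A small sketch of the attribute flag. The proposal above only says
that the high-order bit of attribute_length marks a long-mode
attribute; treating the remaining 31 bits as the actual length is my
assumption here, as are the names.

```java
// Hypothetical helper for the attribute_length flag; names are illustrative.
public class AttributeLength {
    static final int LONG_MODE_BIT = 0x80000000; // high-order bit of the u4 length

    // True if the raw u4 attribute_length marks a long-mode attribute.
    static boolean isLongMode(int rawLength) {
        return (rawLength & LONG_MODE_BIT) != 0;
    }

    // Assumption: the low 31 bits carry the real length in bytes.
    static int payloadLength(int rawLength) {
        return rawLength & 0x7FFFFFFF;
    }

    public static void main(String[] args) {
        int shortAttr = 0x00000010;
        int longAttr = 0x80000010;
        System.out.println(isLongMode(shortAttr) + " " + payloadLength(shortAttr)); // false 16
        System.out.println(isLongMode(longAttr) + " " + payloadLength(longAttr));   // true 16
    }
}
```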
More tricky is the actual Code attribute. It's preferable
for it to be "adaptive": Rather than a global flag that switches
between 16-bit and 32-bit offsets, perhaps it would be
better if an extension of the wide opcode could be used
to indicate 32-bit offsets, since that lets a compiler generate
code on-the-fly without having to know in advance if we
need a long-mode method.
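
To make the per-instruction idea concrete, here is a sketch of an
emitter that picks the encoding for a single ifeq. The opcode values
(wide = 0xc4, ifeq = 0x99) are the existing ones, but today wide only
modifies local-variable instructions; "wide ifeq" with a 4-byte
offset is purely the hypothetical extension discussed above, and the
class name is made up.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical emitter: chooses short or long encoding per branch.
public class BranchEmitter {
    static final int WIDE = 0xC4; // existing wide opcode
    static final int IFEQ = 0x99; // existing ifeq opcode

    static byte[] emitIfeq(int offset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (offset >= Short.MIN_VALUE && offset <= Short.MAX_VALUE) {
            // Fits in 16 bits: the normal encoding, as in today's format.
            out.write(IFEQ);
            out.write(offset >> 8);
            out.write(offset);
        } else {
            // Assumed extension: wide prefix, then a 32-bit branch offset.
            out.write(WIDE);
            out.write(IFEQ);
            out.write(offset >> 24);
            out.write(offset >> 16);
            out.write(offset >> 8);
            out.write(offset);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(emitIfeq(100).length);    // 3
        System.out.println(emitIfeq(100000).length); // 6
    }
}
```

The point of the design is that the decision is local to each
instruction, so nothing earlier in the method has to be re-encoded.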
--
--Per Bothner
[email protected] http://per.bothner.com/