On Mon, Jul 6, 2009 at 1:49 PM, Per Bothner<[email protected]> wrote:
> Of course there are ways to split up a large method into multiple
> methods or multiple class files, but of course that is the wrong
> approach: Asking every language implementation to implement a
> non-trivial *de*-optimization to work around a broken class file
> format is really the wrong approach.

Not to mention that it's sometimes impossible to split things easily.
Imagine the case where you have a lot of embedded conditional or loop
logic manipulating variables at many levels of scoping. How do you
split that? We run into this case even now with JRuby, where a large
Ruby method can produce a gigantic amount of bytecode, with no easy
indication where it could be split into multiple pieces.

> Of course it's not clear the best way to fix the format.
> A simple-minded fix would replace all the u2 types by u4 types.
> This would increase the class size by a bit, though perhaps
> not all that much after compression. Some adaptive mechanism
> would be better, so that compilers generate "large-model
> class files" only when necessary.
>
> Now if re-doing the class-file-format one would prefer to make
> other fixes - for example allow multiple classes in the same
> class file (to reduce constant pool duplication) or drop
> old-style attributes that are subsumed by new attribute types.
>
> However, let's not make the perfect be the enemy of the good
> - at the very least we really should fix the size limitations,
> because that is a hard limit. Having all these 16-bit limits
> when the world is moving to 64-bit is ridiculous.

I know many people have asked for everything you mention, and the
truth is that even in JRuby's case, where we've bent the JVM over
backwards, we still have to work around most of these same
limitations. So would I support fixing them all? Absolutely.

But I suppose the limiting factor is getting someone to lead a JSR.
For better or worse, that's how changes get in. The alternative would
be to just hack these changes into javac and the verifier ourselves,
show how much nicer they are, and get them fast-tracked through the
JCP. That seems to be the most rapid way to spin changes.

> A strawman proposal - though I'm guessing other and
> better-thought-out designs exist out there:
>
> If the u2 constant_pool_count is zero, then the real
> constant_pool_count follows as a u4. In that case, the
> this_class, super_class, fields_count, methods_count, and
> attributes_count are also u4. Likewise indexes and lengths
> in field and method descriptors.
>
> Each constant-pool entry has the same 16-bit format as now,
> unless the tag has the high-order bit set, in which case
> all the u2 indexes are u4.
>
> (Notice it's helpful to allow a "long-mode classfile" to
> contain mix of short-mode and long-mode entries of both
> constants, fields, and methods, so a compiler can start by
> generating short-mode entries, and only emit long-mode entries
> when needed. )
>
> For the various attributes, they can be in either short
> mode or long mode - and a class file can have a mix.
> One way we could indicate a long-mode attribute is that the
> attribute_length has the high-order bit set.
>
> More tricky is the actual Code attribute. It's preferable
> for it "adaptive": Rather than a global flag that switch
> between 16-bit and 32-bit offsets, perhaps it would be
> better to have an extension of the wide opcode could be used
> to indicate 32-bit offsets, since that makes it generate
> code on-the-fly without having to know in advance if we
> need a long-mode method.

This sounds ok to me, but I'm not a .class format expert.

- Charlie
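
As a rough illustration of the first part of the strawman quoted above
(a zero u2 constant_pool_count acting as an escape to a u4 count), a
reader could look something like the sketch below. The class and
method names are hypothetical, and the escape is only the proposal
from this thread, not part of any existing class file format. It
relies on zero not being a usable constant_pool_count today, since
that field holds the number of entries plus one.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch of the "zero means long mode" escape from the strawman.
public class LongModeCount {

    // The decoded count plus a flag saying whether the u4 escape was used.
    static final class Count {
        final long value;
        final boolean longMode;

        Count(long value, boolean longMode) {
            this.value = value;
            this.longMode = longMode;
        }
    }

    // Reads a u2 count; if it is zero, the real count follows as an unsigned u4.
    static Count readConstantPoolCount(DataInputStream in) throws IOException {
        int u2 = in.readUnsignedShort();
        if (u2 != 0) {
            return new Count(u2, false);           // ordinary short-mode class file
        }
        long u4 = in.readInt() & 0xFFFFFFFFL;      // treat the 32 bits as unsigned
        return new Count(u4, true);                // assumed long-mode class file
    }

    public static void main(String[] args) throws IOException {
        byte[] shortMode = {0x00, 0x10};                         // u2 = 16
        byte[] longMode  = {0x00, 0x00, 0x00, 0x01, 0x00, 0x00}; // u2 = 0, u4 = 65536

        Count a = readConstantPoolCount(
                new DataInputStream(new ByteArrayInputStream(shortMode)));
        Count b = readConstantPoolCount(
                new DataInputStream(new ByteArrayInputStream(longMode)));

        System.out.println(a.value + " " + a.longMode); // 16 false
        System.out.println(b.value + " " + b.longMode); // 65536 true
    }
}
```

Under the proposal, the longMode flag would then decide whether
this_class, super_class, and the other counts are read as u2 or u4.
The per-entry and per-attribute high-bit escapes would need similar
checks, which this sketch does not attempt.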
