Re: large Class.forName() patch
Just to make this clear. I don't plan on doing anything. I'm just throwing out ideas here. But if I implemented it, I think I would use such a flag to special-case regular strings. Artur's numbers seem useful. However, as you say, a full investigation would be needed to quantify the benefits gained from a utf8-based implementation. > > > If somebody > > implements full String/StringBuffer class using utf8 we could perform > > some real tests > > I guess you mean full String using UCS2 (2 bytes per character)? A UTF8 > implementation is what is there now. > No, the current implementation uses only 2 bytes per character strings for all java.lang.String objects. Internally, we store some information as utf8s, but I think this discussion was about how to best implement java.lang.String. - Godmar
Re: large Class.forName() patch
On Tue, 8 Feb 2000, Artur Biesiadowski wrote: > > Also, I think that charAt() can be sped up for "regular" strings > > that contain only ASCII chars < 127, because those can be represented in > > a byte array. So you have to ask what strings you're application is > > dealing with. That may depend on what i18n is used etc. > > Yes, it will go faster as no decoding would have to be done for every > separate byte, but do you plan using special flag for marking such > Strings ? This is probably not a bad idea. Strings are immutable, so you only have to set the flag once when the string is created. You will save a lot of time with ASCII strings (by far the most common case) by using this one extra bit of memory and checking it when you do the charAt. On the other hand, you don't want the extra time taken creating strings to outweigh the benefit. > If somebody > implements full String/StringBuffer class using utf8 we could perform > some real tests I guess you mean full String using UCS2 (2 bytes per character)? A UTF8 implementation is what is there now.
Re: large Class.forName() patch
Godmar Back wrote: > > Even for runtime efficiency, you need to look at profiling data and > decide whether it's worth it. Is charAt() really the most frequent > operation on java.lang.String? For which application or set of > applications? > I had no time to implement full String using utf8, so I emulated slower utf8 behaviour by adding loop for normal charAt, which traverses char array from start to point of check, compares if it is greater than 127 and if yes assigns it to itself. Of course it is quite stupid opeartion but I think it should be something in order of decoding utf8 (and same as there there is only one 'if' and no more operations per character if it is < 127). I don't suppose that working on bytes nor charater will change speed here, so it should be good approximate. I've changed only String.charAt - there will be more functions that need to be changed possibly slowing it down further. I've choosen quite String intensive application - I was compiling SwingSet with kjc. Test was done under JDK 1.2v1 for linux with default sun jit. With normal String kjc compiled everything in about 11 seconds (measured by time command, user time - together with startup, dynamic linking of libraries et all). With slower String it took slightly over 14.5 seconds. As to how common is String.charAt - I've changed tya to count all invokes and invokes to String.charAt at runtime. I've run SwingSet demo and then clicked on every tab, but without meddling with components. Ratio String.charAt/all calls was 309108/8611457. So about 3,5% of all java calls in graphics heavy application is String.charAt. It is not something to ignore - but making it considerably slower you impact all java applications in visible manner - and remeber that it is with only single method changed. > Also, I think that charAt() can be sped up for "regular" strings > that contain only ASCII chars < 127, because those can be represented in > a byte array. So you have to ask what strings you're application is > dealing with. That may depend on what i18n is used etc. Yes, it will go faster as no decoding would have to be done for every separate byte, but do you plan using special flag for marking such Strings ? If not, you will have to traverse it one by one anyway. > I don't think one should rush to conclusions without data to back it up. > It may or may not be a benefit, and most likely it will be sometimes. No I had no data at time I wrote last mail and even currently I've done only two small tests, but I have a general impression that String performance is one of most important things for speed of java. I'm sure I have reads few articles about it - unfortunately no links, so you can suppose it is just my impression. At last, I would like to agree with you that there are some ways in which such implementation could perform better - memory usage, faster copying and of course integration with internals of vm (which was a reason for it in first place looking at this thread). If somebody implements full String/StringBuffer class using utf8 we could perform some real tests - what I'm trying to say is that speed of String, especiall charAt is something to really care about - it is one of hotspots for EVERY (or almost every:) java application. Artur
Re: large Class.forName() patch
> > > Godmar Back wrote: > > > Thinking about it, maybe it would even be worth thinking about > > not implementing Strings as char[] arrays internally, but as utf8strings. > > This would save space for all ASCII strings, it would allows to directly > > use a literal's utf8 representation. Since strings are immutable, the > > internal char[] array is never directly exposed anyway. > > It is done this way in gcj I think. But I do not suppose that it is a > good thing for java only implementation - main aim for Strings should be > efficiency. I think that String.charAt is method called often enough to > cause major performance drop in entire java app with such strings. > We may be talking about different things here, but I don't think gcj stores strings as utf8strings in the way I propose. Look at their implementation of charAt(), which is mapped to "(jchar*)((char*) str->data + str->boffset);" via JvGetStringChars. Saying that efficiency should be the main aim for Java begs the question what kind of efficiency: you imply runtime efficiency, but space efficiency can sometimes be more important. Even for runtime efficiency, you need to look at profiling data and decide whether it's worth it. Is charAt() really the most frequent operation on java.lang.String? For which application or set of applications? Also, I think that charAt() can be sped up for "regular" strings that contain only ASCII chars < 127, because those can be represented in a byte array. So you have to ask what strings you're application is dealing with. That may depend on what i18n is used etc. I don't think one should rush to conclusions without data to back it up. It may or may not be a benefit, and most likely it will be sometimes. - Godmar ps: I think Alexandre is right about class names not being able to contain a \u0: while they're represented as a ConstantUtf8Info, the spec also says Class names that appear in class file structures are always represented in a fully qualified form (§2.7.9). These class names are always represented as CONSTANT_Utf8_info (§4.4.7) structures, and they are referenced from those CONSTANT_NameAndType_info (§4.4.6) structures that have class names as part of their descriptor (§4.3), as well as from all CONSTANT_Class_info (§4.4.1) structures. For historical reasons the exact syntax of fully qualified class names that appear in class file structures differs from the familiar Java fully qualified class name documented in §2.7.9. In the internal form, the ASCII periods ('.') that normally separate the identifiers (§2.2) that make up the fully qualified name are replaced by ASCII forward slashes ('/'). For example, the normal fully qualified name of class Thread is java.lang.Thread. In the form used in descriptors in class files, a reference to the name of class Thread is implemented using a CONSTANT_Utf8_info structure representing the string "java/lang/Thread" And if you follow the link the identifiers, it'll says that it's a sequence of isJavaLetterOrDigit chars, which are letters, digit, _ and $. That said, it doesn't solve the problem that we must throw an exception if the user looks up "java.lang.String\0" or the like.
Re: large Class.forName() patch
Godmar Back wrote: > Thinking about it, maybe it would even be worth thinking about > not implementing Strings as char[] arrays internally, but as utf8strings. > This would save space for all ASCII strings, it would allows to directly > use a literal's utf8 representation. Since strings are immutable, the > internal char[] array is never directly exposed anyway. It is done this way in gcj I think. But I do not suppose that it is a good thing for java only implementation - main aim for Strings should be efficiency. I think that String.charAt is method called often enough to cause major performance drop in entire java app with such strings. Artur
Re: large Class.forName() patch
> > > On Feb 4, 2000, Mo DeJong <[EMAIL PROTECTED]> wrote: > > > The current Kaffe implementation means that a \0 embedded in the > > class name will screw up code that expects a \0 at the end of a > > string. > > AFAIK, there can't be \0s embedded in class names. Utf8strings can have \u0 characters in them, and the JVM spec says that the class name is a CONSTANT_Utf8Info, so it can have a \u0 in it as well. Why should it not? Limitations on characters in your identifier make sense at the java source level, but they're not necessary at the bytecode level. > And, in any case, > utf8 strings won't contain a \0 char unless they actually contain \u0. > Not so. On the contrary, utf8strings encode the \u0 with something different than \0. The convenience of utf8 in the JVM spec lies in that it can encode all unicode character without using a \0. Hence, they can be represented as zero-terminated strings, which is what we do. The problem comes in when we convert either 16bit unicode java strings or utf8strings to zero-terminated 8bit C strings. Then we can't represent the \u0 character (among others), and problems arise. If we directly converted from unicode java to utf8strings, there would be no problem. Thinking about it, maybe it would even be worth thinking about not implementing Strings as char[] arrays internally, but as utf8strings. This would save space for all ASCII strings, it would allows to directly use a literal's utf8 representation. Since strings are immutable, the internal char[] array is never directly exposed anyway. - Godmar
Re: large Class.forName() patch
On Feb 4, 2000, Mo DeJong <[EMAIL PROTECTED]> wrote: > The current Kaffe implementation means that a \0 embedded in the > class name will screw up code that expects a \0 at the end of a > string. AFAIK, there can't be \0s embedded in class names. And, in any case, utf8 strings won't contain a \0 char unless they actually contain \u0. -- Alexandre Oliva http://www.ic.unicamp.br/~oliva IC-Unicamp, Bra[sz]il oliva@{lsd.ic.unicamp.br,guarana.{org,com}} aoliva@{acm,computer}.org oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org} ** I may forward mail about projects to mailing lists; please use them
Re: large Class.forName() patch
> > > > > > > > On Thu, 3 Feb 100, Godmar Back wrote: > > > > Ok, that sounds like a plan. The only other problem I noticed is that > > Kaffe seems to use plain char* types for class names. These class names > > really should be in unicode strings or at least UTF8 strings. The > > current Kaffe implementation means that a \0 embedded in the class name > > will screw up code that expects a \0 at the end of a string. One > > of the regression test checks for this. > > > > We should probably keep all descriptors/identifiers that were derived > from uft8consts in the constant pool as utf8consts all the time. > Actually, I was a bit too fast with my reply. Fortunately, we already use utf8consts to store the class names internally. The problem comes in when we convert from a java string to a C string to a utf8 string. Then we do seem to lose such strings. This is exactly what we do in java_lang_Class_forName. What we need is a method to convert Java strings to utf8consts without going through C strings. - Godmar
Re: large Class.forName() patch
> > > On Thu, 3 Feb 100, Godmar Back wrote: > > Ok, that sounds like a plan. The only other problem I noticed is that > Kaffe seems to use plain char* types for class names. These class names > really should be in unicode strings or at least UTF8 strings. The > current Kaffe implementation means that a \0 embedded in the class name > will screw up code that expects a \0 at the end of a string. One > of the regression test checks for this. > We should probably keep all descriptors/identifiers that were derived from uft8consts in the constant pool as utf8consts all the time. - Godmar
Re: large Class.forName() patch
On Thu, 3 Feb 100, Godmar Back wrote: Ok, that sounds like a plan. The only other problem I noticed is that Kaffe seems to use plain char* types for class names. These class names really should be in unicode strings or at least UTF8 strings. The current Kaffe implementation means that a \0 embedded in the class name will screw up code that expects a \0 at the end of a string. One of the regression test checks for this. > I'm working on it. We'll first try renaming the basic types internally > w/o breaking gcj (Jason sent me a patch for that), and see where that gets > us, and then the rest. I think we should be able to match Sun's output > exactly. Unfortunately, egcs broke on me right after I checked it out > today so waiting on that to be fixed before I can do that. You can add it to Kaffe's regression tests but I am planning on turning it into a Mauve test so you should not really need to add it. Mo Dejong Red Hat Inc. > Thanks very much for your regression test. > > - Godmar > > > > > It sounds like there is still some debate as to how the > > primitive classes should be searched for by name, so could > > we agree on this trimmed down patch that does not include > > the part that disables lookups for "int" and such? > > >
Re: large Class.forName() patch
I'm working on it. We'll first try renaming the basic types internally w/o breaking gcj (Jason sent me a patch for that), and see where that gets us, and then the rest. I think we should be able to match Sun's output exactly. Unfortunately, egcs broke on me right after I checked it out today so waiting on that to be fixed before I can do that. Thanks very much for your regression test. - Godmar > > It sounds like there is still some debate as to how the > primitive classes should be searched for by name, so could > we agree on this trimmed down patch that does not include > the part that disables lookups for "int" and such? >
Re: large Class.forName() patch
Mo DeJong writes: > It sounds like there is still some debate as to how the > primitive classes should be searched for by name, so could > we agree on this trimmed down patch that does not include > the part that disables lookups for "int" and such? Minor nit: it seems clearer to do this version of your first itypes.c patch.. -Archie ___ Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com Index: itypes.c === RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/itypes.c,v retrieving revision 1.17 diff -u -r1.17 itypes.c --- itypes.c1999/11/29 23:44:10 1.17 +++ itypes.c2000/02/03 23:49:21 @@ -131,23 +131,53 @@ Hjava_lang_Class* classFromSig(const char** strp, Hjava_lang_ClassLoader* loader, errorInfo *einfo) { - Hjava_lang_Class* cl; + Hjava_lang_Class* cl = NULL; Utf8Const* utf8; const char* start; const char* end; - switch (*(*strp)++) { - case 'V': return (voidClass); - case 'I': return (intClass); - case 'Z': return (booleanClass); - case 'S': return (shortClass); - case 'B': return (byteClass); - case 'C': return (charClass); - case 'F': return (floatClass); - case 'D': return (doubleClass); - case 'J': return (longClass); - case '[': return (lookupArray(classFromSig(strp, loader, einfo), - einfo)); + /* Check for primitive types */ + switch (*(*strp)) { + case 'V': + cl = voidClass; + break; + case 'I': + cl = intClass; + break; + case 'Z': + cl = booleanClass; + break; + case 'S': + cl = shortClass; + break; + case 'B': + cl = byteClass; + break; + case 'C': + cl = charClass; + break; + case 'F': + cl = floatClass; + break; + case 'D': + cl = doubleClass; + break; + case 'J': + cl = longClass; + break; + } + if (cl != NULL) { /* Any trailing junk is an error */ + if (*++(*strp) != '\0') { + postException(einfo, JAVA_LANG(VerifyError)); + return (NULL); + } + return (cl); + } + + /* Non-primitive types */ + switch (*(*strp)++) { + case '[': + return (lookupArray(classFromSig(strp, loader, einfo), einfo)); case 'L': start = *strp; for (end = start; *end != 0 && *end != ';'; end++)
Re: large Class.forName() patch
It sounds like there is still some debate as to how the primitive classes should be searched for by name, so could we agree on this trimmed down patch that does not include the part that disables lookups for "int" and such? After lookups for "int" and friends are fixed the right way (assuming there is agreement on what that way is), will the following test still work? If we can not search by a class name like "void" then will there be macro tests like CLASS_ISVOID() CLASS_ISINT() added so we can do such tests? if (strcmp(CLASS_CNAME(c), "void") == 0) { // Do Something } Mo Dejong Red Hat Inc. Index: kaffe/kaffevm/classMethod.c === RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/classMethod.c,v retrieving revision 1.75 diff -u -r1.75 classMethod.c --- kaffe/kaffevm/classMethod.c 2000/01/19 11:04:31 1.75 +++ kaffe/kaffevm/classMethod.c 2000/02/03 23:06:12 @@ -1110,12 +1110,13 @@ } class = loadClass(utf8, NULL, einfo); utf8ConstRelease(utf8); + if (class != 0) { if (processClass(class, CSTATE_COMPLETE, einfo) == true) { return (class); } } - return (0); + return (NULL); } /* @@ -2281,12 +2282,18 @@ /* If we couldn't resolve the element type, there's no way we can * construct the array type. */ - if (c == 0) { - return (0); + if (c == NULL) { + return (NULL); } /* Build signature for array type */ if (CLASS_IS_PRIMITIVE (c)) { + /* An array of type void is not allowed */ + if (strcmp(CLASS_CNAME(c), "void") == 0) { + postException(einfo, JAVA_LANG(VerifyError)); + return (NULL); + } + arr_class = CLASS_ARRAY_CACHE(c); if (arr_class) { return (arr_class); @@ -2304,12 +2311,12 @@ arr_name = utf8ConstNew(sig, -1); /* release before returning */ if (!arr_name) { postOutOfMemory(einfo); - return 0; + return (NULL); } centry = lookupClassEntry(arr_name, c->loader, einfo); if (centry == 0) { utf8ConstRelease(arr_name); - return (0); + return (NULL); } if (centry->class != 0) { Index: kaffe/kaffevm/itypes.c === RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/itypes.c,v retrieving revision 1.17 diff -u -r1.17 itypes.c --- kaffe/kaffevm/itypes.c 1999/11/29 23:44:10 1.17 +++ kaffe/kaffevm/itypes.c 2000/02/03 23:06:12 @@ -131,21 +131,49 @@ Hjava_lang_Class* classFromSig(const char** strp, Hjava_lang_ClassLoader* loader, errorInfo *einfo) { - Hjava_lang_Class* cl; + Hjava_lang_Class* cl = NULL; Utf8Const* utf8; const char* start; const char* end; switch (*(*strp)++) { - case 'V': return (voidClass); - case 'I': return (intClass); - case 'Z': return (booleanClass); - case 'S': return (shortClass); - case 'B': return (byteClass); - case 'C': return (charClass); - case 'F': return (floatClass); - case 'D': return (doubleClass); - case 'J': return (longClass); + case 'V': + if (cl == NULL) + cl = voidClass; + case 'I': + if (cl == NULL) + cl = intClass; + case 'Z': + if (cl == NULL) + cl = booleanClass; + case 'S': + if (cl == NULL) + cl = shortClass; + case 'B': + if (cl == NULL) + cl = byteClass; + case 'C': + if (cl == NULL) + cl = charClass; + case 'F': + if (cl == NULL) + cl = floatClass; + case 'D': + if (cl == NULL) + cl = doubleClass; + case 'J': + if (cl == NULL) + cl = longClass; + + if (cl != NULL) { + /* If build in type char is not at the end of the string, malformed +signature */ + if (*(*strp) != 0) { + postException(einfo, JAVA_LANG(VerifyError)); + return (NULL); + } + } + + return (cl); case '[': return (lookupArray(classFromSig(strp, loader, einfo), einfo)); case 'L': @@ -159,11 +187,17 @@ utf8 = utf8ConstNew(start, end - start); if (!utf8) { postOutOfMemory(einfo); - return 0; + return (NULL); } c
Re: large Class.forName() patch
On that note, here is an updated regression test that takes care of the case where you have an actual Class name "int" or "I" and it also checks for / in the name of a class passed to forName(). Mo Dejong Red Hat Inc. On Tue, 1 Feb 2000, Godmar Back wrote: > Let me look at Mo's patch more closely and maybe we find something > that will not break the gcj stuff. I kind of like to keep the primitive > classes in the classpool---this I think is nicer for keeping statistics > on all classes, for instance. > > As for the Class.forName() interface, I think it's better to strive > for 100% Sun compatibility here, which means to not provide backdoor > ways to lookup primitive classes such as looking up "I" or "int" if > Sun doesn't. For instance, I got totally > confused already when I thought that kaffe could look up "I" before I > realized that's because I do all my tests in a directory filled with > A.class, B.class, ..., I.class, ... Z.class files. > > That said, I find Sun's implementation of Class.forName horrible. > Why do I have use the L; form in an array but not for simple classes? > It makes no sense. Of course, the JDK doc also doesn't document it. > And why shouldn't you be able to look primitive classes up by name? > > - Godmar public class ArrayForName { public static void testLoadArray() throws Exception { // Loading by built-in type ID is not allowed // int != I // boolean != Z // long != J // float!= F // double != D // byte != B // short!= S // char != C // void != V expect("I", "Exception"); expect("Z", "Exception"); expect("J", "Exception"); expect("F", "Exception"); expect("D", "Exception"); expect("B", "Exception"); expect("S", "Exception"); expect("C", "Exception"); expect("V", "Exception"); // Not possible to load by builtin type name expect("int", "Exception"); expect("boolean", "Exception"); expect("long","Exception"); expect("float", "Exception"); expect("double", "Exception"); expect("byte","Exception"); expect("short", "Exception"); expect("char","Exception"); expect("void","Exception"); // Test loading an array by built-in type id // int[]== [I // int[][] == [[I // boolean[]== [Z // boolean[][] == [[Z // long[] == [J // long[][] == [[J // float[] == [F // float[][]== [[F // double[] == [D // double[][] == [[D // byte[] == [B // byte[][] == [[B // short[] == [S // short[][]== [[S // char[] == [C // char[][] == [[C expect("[I", "int[]"); expect("[[I", "int[][]"); expect("[Z", "boolean[]"); expect("[[Z", "boolean[][]"); expect("[J", "long[]"); expect("[[J", "long[][]"); expect("[F", "float[]"); expect("[[F", "float[][]"); expect("[D", "double[]"); expect("[[D", "double[][]"); expect("[B", "byte[]"); expect("[[B", "byte[][]"); expect("[S", "short[]"); expect("[[S", "short[][]"); expect("[C", "char[]"); expect("[[C", "char[][]"); // Array of type void is not allowed expect("[V","Exception"); expect("[[V", "Exception"); expect("[[[V", "Exception"); // When loading an array using the built-in // type id, id must be at end of string expect("[II", "Exception"); expect("[ZZ", "Exception"); expect("[JJ", "Exception"); expect("[FF", "Exception"); expect("[DD", "Exception"); expect("[BB", "Exception"); expect("[SS", "Exception"); expect("[CC", "Exception"); expect("[ZZ", "Exception"); expect("[C;", "Exception"); expect("[C\0;", "Exception"); // [L + Class + ; // Primitive Class name is not valid expect("[Lint;", "Exception"); expect("[Lboolean;", "Exception"); expect("[Llong;","Exception"); expect("[Lfloat;", "Exception"); expect("[Ldouble;", "Exception"); expect("[Lbyte;","Exception"); expect("[Lshort;", "Exception"); expect("[Lchar;","Exception"); expect("[Lvoid;","Exception"); // java.lang.Object[] == [Ljava.lang.Object; // java.lang.Object[][] == [[Ljava.lang.Object; // java.lang.String[] == [Ljava.lang.String; // java.lang.String[][] == [[Ljava.lang.String; expe
Re: large Class.forName() patch
> > > > > Jason Baker wrote: > > > ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", ""); > > > > That code is actually commented out already. Godmar will have to > > chime in on its intent. > > > > At first, I thought I just fake the java::lang::Class _Jv_intClass structure > as a big bss symbol. Then I realized this doesn't work, and the structure > is now defined in gcj-class.c > > > > ./kaffe/kaffevm/itypes.c: initPrimClass(&intClass, "int", 'I', 4); > > > > I suggest using some completely bogus, yet human-readable name in this > > context. The primitive types should never be looked up by name, so > > "[int]" or ";primitive int" or ... hmm, looking in the VM spec (for I like ";int". (But not on that line, getName() should still work.) > > invalid name ideas), I noticed actually it says in S2.2 Identifiers: > > "An identifier must not be ... a keyword in the Java programming > > language." So, maybe a class with name 'int' is bogus Something > > that most JVMs don't seem to check, it seems. > > > > > ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = >Class.getPrimitiveClass("int"); > > > > As Jason pointed out, this is an internal hack for Kaffe. The > > argument could just as easily be Class.PRIMITIVE_INT_MAGIC > > I said nothing of the kind. Since getPrimitiveClass doesn't look the name up in any hashtable, no PRIMITIVE_INT_MAGIC is needed. But, it could be changed to do { return forName0(';' + name); }, then the check for weird primitive classes could be done in java code with name.charAt(0) == ';'. > That said, I find Sun's implementation of Class.forName horrible. > Why do I have use the L; form in an array but not for simple classes? I'd like to know why the L notation uses dots instead of slashes. > It makes no sense. Of course, the JDK doc also doesn't document it. > And why shouldn't you be able to look primitive classes up by name? Because they aren't bytecode keywords! Jason
Re: large Class.forName() patch
> > Jason Baker wrote: > > ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", ""); > > That code is actually commented out already. Godmar will have to > chime in on its intent. > At first, I thought I just fake the java::lang::Class _Jv_intClass structure as a big bss symbol. Then I realized this doesn't work, and the structure is now defined in gcj-class.c > > ./kaffe/kaffevm/itypes.c: initPrimClass(&intClass, "int", 'I', 4); > > I suggest using some completely bogus, yet human-readable name in this > context. The primitive types should never be looked up by name, so > "[int]" or ";primitive int" or ... hmm, looking in the VM spec (for > invalid name ideas), I noticed actually it says in S2.2 Identifiers: > "An identifier must not be ... a keyword in the Java programming > language." So, maybe a class with name 'int' is bogus Something > that most JVMs don't seem to check, it seems. > > > ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = >Class.getPrimitiveClass("int"); > > As Jason pointed out, this is an internal hack for Kaffe. The > argument could just as easily be Class.PRIMITIVE_INT_MAGIC > I think this sounds about right. gcj still uses the same scheme by giving the primitive classes internally the names "int" etc. See DECLARE_PRIM_TYPE in libgcj/libjava/prims.cc So if you changed that, we'd have to map it in gcj/gcj-class.c. I do remember that I needed the "int" etc. in the class pool for gcj code to work; I don't remember the details now. It did fix something. Note that Class.getPrimitiveClass does not appear to be an exported function, but a function that's package-internal to java.lang, presumably just to implement the Number.TYPE initializations. I think it may date back when Kaffe used Sun's classes.zip, which explains the native implementation of it. I think Pat is right that the use of "int" in the old kaffe code in java/lang/*.java and their inclusion in the class pool for gcj support are unrelated. Let me look at Mo's patch more closely and maybe we find something that will not break the gcj stuff. I kind of like to keep the primitive classes in the classpool---this I think is nicer for keeping statistics on all classes, for instance. As for the Class.forName() interface, I think it's better to strive for 100% Sun compatibility here, which means to not provide backdoor ways to lookup primitive classes such as looking up "I" or "int" if Sun doesn't. For instance, I got totally confused already when I thought that kaffe could look up "I" before I realized that's because I do all my tests in a directory filled with A.class, B.class, ..., I.class, ... Z.class files. That said, I find Sun's implementation of Class.forName horrible. Why do I have use the L; form in an array but not for simple classes? It makes no sense. Of course, the JDK doc also doesn't document it. And why shouldn't you be able to look primitive classes up by name? - Godmar
Re: large Class.forName() patch
Jason Baker wrote: > ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", ""); That code is actually commented out already. Godmar will have to chime in on its intent. > ./kaffe/kaffevm/itypes.c: initPrimClass(&intClass, "int", 'I', 4); I suggest using some completely bogus, yet human-readable name in this context. The primitive types should never be looked up by name, so "[int]" or ";primitive int" or ... hmm, looking in the VM spec (for invalid name ideas), I noticed actually it says in S2.2 Identifiers: "An identifier must not be ... a keyword in the Java programming language." So, maybe a class with name 'int' is bogus Something that most JVMs don't seem to check, it seems. > ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = >Class.getPrimitiveClass("int"); As Jason pointed out, this is an internal hack for Kaffe. The argument could just as easily be Class.PRIMITIVE_INT_MAGIC -Pat - - --- --- -- -- - - - Pat Tullmann [EMAIL PROTECTED] That which does not kill you just didn't try hard enough.
Re: large Class.forName() patch
Archie Cobbs <[EMAIL PROTECTED]> writes: > Jason Baker writes: > > > #define INTEGER_PRIMITIVE_CLASS "I" > > > > > > and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work? > > > > No, its even worse, since a class in the anonymous package is alot > > more likely to be called "I" than "int". We can use "I" only where a > > class would be L;. > > Good point.. what about > > #define INTEGER_PRIMITIVE_CLASS ";I" > #define BOOLEAN_PRIMITIVE_CLASS ";Z" > That works, but why are we looking up primitve types by name anyway? nephi(14) find . -type f -print | xargs grep '"int"' ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", ""); ./kaffe/kaffevm/itypes.c:* Note that a user-defined "class int" is impossible cause "int" ./kaffe/kaffevm/itypes.c: initPrimClass(&intClass, "int", 'I', 4); ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = Class.getPrimitiveClass("int"); getPrimitveClass does not look up types by name, so the only problem is in developers/fixup.c. Why can't it just translate _Jv_intClass to intClass? Jason
Re: large Class.forName() patch
Jason Baker writes: > > #define INTEGER_PRIMITIVE_CLASS "I" > > > > and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work? > > No, its even worse, since a class in the anonymous package is alot > more likely to be called "I" than "int". We can use "I" only where a > class would be L;. Good point.. what about #define INTEGER_PRIMITIVE_CLASS ";I" #define BOOLEAN_PRIMITIVE_CLASS ";Z" etc.. ? ___ Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: large Class.forName() patch
Archie Cobbs <[EMAIL PROTECTED]> writes: > > #define INTEGER_PRIMITIVE_CLASS "I" > > and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work? No, its even worse, since a class in the anonymous package is alot more likely to be called "I" than "int". We can use "I" only where a class would be L;. Jason
Re: large Class.forName() patch
Mo DeJong <[EMAIL PROTECTED]> writes: > Ok, if you have a class named "int" then the JVM should load > that when you call forName("int"). I am not sure what Kaffe > would do in this case because it already uses the name "int" > for the primitive class "I". I think the best way to make > the regression test check for this is to ensure that > a loaded class (like "int") is not a primitive type. ClassMethod.c is not to blame, the problem lies in itypes.c. What happens when GCJ code is loaded after the class int? You can solve the forName problem by backing out this change and finding another way to expose primitive types to GCJ code. Since GCJ support doesn't actually work, backing out the change along is another option. Index: itypes.c === RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/itypes.c,v retrieving revision 1.16 retrieving revision 1.17 diff -c -r1.16 -r1.17 *** itypes.c1999/11/04 20:29:12 1.16 --- itypes.c1999/11/29 23:44:10 1.17 *** *** 36,42 --- 36,45 void initPrimClass(Hjava_lang_Class** class, char* name, char sig, int len) { + errorInfo info; + classEntry* centry; Hjava_lang_Class* clazz = newClass(); + if (clazz == 0) { goto bad; } *** *** 56,61 --- 59,78 TYPE_PRIM_SIZE(clazz) = len; /* prevent any attempt to process those in processClass */ clazz->state = CSTATE_COMPLETE; + + /* Add primitive types to the class pool as well +* This allows us to find them by name---for instance from gcj code. +* +* Note that a user-defined "class int" is impossible cause "int" +* is a keyword. +*/ + centry = lookupClassEntry(clazz->name, 0, &info); + if (centry == 0) { + goto bad; + } + clazz->centry = centry; + centry->class = clazz; + return; bad: fprintf(stderr, "not enough memory to run kaffe\n"); Jason
Re: large Class.forName() patch
Mo DeJong writes: > Ok, if you have a class named "int" then the JVM should load > that when you call forName("int"). I am not sure what Kaffe > would do in this case because it already uses the name "int" > for the primitive class "I". I think the best way to make > the regression test check for this is to ensure that > a loaded class (like "int") is not a primitive type. So we should fix kaffe internally. It shouldn't use "int" as an internal name for the integer primitive class, because that clashes with int.class; instead, let's use "I" like we're supposed to, and maybe define a macro: #define INTEGER_PRIMITIVE_CLASS "I" and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work? -Archie ___ Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: large Class.forName() patch
Ok, if you have a class named "int" then the JVM should load that when you call forName("int"). I am not sure what Kaffe would do in this case because it already uses the name "int" for the primitive class "I". I think the best way to make the regression test check for this is to ensure that a loaded class (like "int") is not a primitive type. Mo Dejong Red Hat Inc. On Mon, 31 Jan 2000, Patrick Tullmann wrote: > I ran ArrayForName on JDK 1.2.2 on Solaris. Got this output: > > for clsName "int" expected "Exception" but got "int" > for clsName "[Lint;" expected "Exception" but got "int[]" > > Note that I had a (legit) class called 'int.class' in my current > directory when I ran the test. Loading 'int' does *not* load the > Integer class or the primitive 'int' class, but it *will* load some > arbitrary class called 'int'. So the explicit denial of 'int' classes > is wrong. > > -Pat
Re: large Class.forName() patch
On Sun, 30 Jan 2000, Archie Cobbs wrote: > > Mo DeJong writes: > > I have been working on getting Class.forName() working more like > > the Sun JDK and I have a patch that I would like to get some comments > > I don't understand this comment: > > + /* This is kind of strange, but Sun's implementation > + does not allow lookups like Class.forName("int"), > + so if this is a primitive type, we just pretend > + it could not be found. It would be handy to load > + "int" or "char" by name, but oh well */ Well, then why does Kaffe allow lookups by name? I thought it might be a handy feature but it is not something that is allowed in the Sun JDK. My patch removed the ability to do lookups like Class.forName("int") and Class.forName("void"), so that Kaffe would act more like the Sun JDK. I think your "Quick Take" on it was the same as mine, I just want to make sure that everyone agrees that such a lookup should not be allowed and that the patch I sent is the correct way to disallow them. If you like we can change the "This is kind of strange, but" text to "Testing shows that Sun's implementation ...". Mo DeJong Red Hat Inc. > Why would a JVM be expected to recognize a Java language keyword? > As far as the JVM is concerned, Java is just one of many higher > level languages that are capable of being compiled to bytecode > (or am I misreading something, this was just a quick take). > > -Archie > > ___ > Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com >
Re: large Class.forName() patch
Archie Cobbs wrote: > I don't understand this comment: > > + /* This is kind of strange, but Sun's implementation > + does not allow lookups like Class.forName("int"), > + so if this is a primitive type, we just pretend > + it could not be found. It would be handy to load > + "int" or "char" by name, but oh well */ > > Why would a JVM be expected to recognize a Java language keyword? > As far as the JVM is concerned, Java is just one of many higher > level languages that are capable of being compiled to bytecode > (or am I misreading something, this was just a quick take). But it could understand I, J etc - especially for [I,[[I and friends classes. Certainly 'int' is not right thing - but single letter codes are JVM thing. Is there any place where it is explicitly told if Class.forName supports arrays of primitives ? Artur
Re: large Class.forName() patch
Mo DeJong writes: > I have been working on getting Class.forName() working more like > the Sun JDK and I have a patch that I would like to get some comments I don't understand this comment: + /* This is kind of strange, but Sun's implementation + does not allow lookups like Class.forName("int"), + so if this is a primitive type, we just pretend + it could not be found. It would be handy to load + "int" or "char" by name, but oh well */ Why would a JVM be expected to recognize a Java language keyword? As far as the JVM is concerned, Java is just one of many higher level languages that are capable of being compiled to bytecode (or am I misreading something, this was just a quick take). -Archie ___ Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
Re: large Class.forName() patch
> Hi all. > > I have been working on getting Class.forName() working more like > the Sun JDK and I have a patch that I would like to get some comments > on. I have attached a file called ArrayForName.java which tests > the Class.forName() method and compares the result to an expected > result. First, try running this test case on the current CVS version > of Kaffe to see how many failures you get. Then apply the patch > called forname.patch and check out how many problems it fixes. > > I feel good about almost all of the changes in the patch, except > for the first on (the one in file classMethod.c around line 1035). > > I am not sure if this would break JNI code, or if was just the wrong > place to make the change. Could someone who knows more about the > internals check it out and suggest a better change if there is one? > Wow, I had no idea that forName accepted some of the things you found. I hava noticed that it accepts [Ljava/lang/Object; as well as [Ljava.lang.Object; It doesn't look like you changed this, which is probably just as well. As for primitive types, I suspect that keeping them in the classpool is a bug in the first place. Why shouldn't a JVM accept a class who's name is a Java Language keyword? Jason
large Class.forName() patch
Hi all. I have been working on getting Class.forName() working more like the Sun JDK and I have a patch that I would like to get some comments on. I have attached a file called ArrayForName.java which tests the Class.forName() method and compares the result to an expected result. First, try running this test case on the current CVS version of Kaffe to see how many failures you get. Then apply the patch called forname.patch and check out how many problems it fixes. I feel good about almost all of the changes in the patch, except for the first on (the one in file classMethod.c around line 1035). I am not sure if this would break JNI code, or if was just the wrong place to make the change. Could someone who knows more about the internals check it out and suggest a better change if there is one? thanks Mo Dejong Red Hat Inc. public class ArrayForName { public static void testLoadArray() throws Exception { // Loading by built-in type ID is not allowed // int != I // boolean != Z // long != J // float!= F // double != D // byte != B // short!= S // char != C // void != V expect("I", "Exception"); expect("Z", "Exception"); expect("J", "Exception"); expect("F", "Exception"); expect("D", "Exception"); expect("B", "Exception"); expect("S", "Exception"); expect("C", "Exception"); expect("V", "Exception"); // Not possible to load by builtin type name expect("int", "Exception"); expect("boolean", "Exception"); expect("long","Exception"); expect("float", "Exception"); expect("double", "Exception"); expect("byte","Exception"); expect("short", "Exception"); expect("char","Exception"); expect("void","Exception"); // Test loading an array by built-in type id // int[]== [I // int[][] == [[I // boolean[]== [Z // boolean[][] == [[Z // long[] == [J // long[][] == [[J // float[] == [F // float[][]== [[F // double[] == [D // double[][] == [[D // byte[] == [B // byte[][] == [[B // short[] == [S // short[][]== [[S // char[] == [C // char[][] == [[C expect("[I", "int[]"); expect("[[I", "int[][]"); expect("[Z", "boolean[]"); expect("[[Z", "boolean[][]"); expect("[J", "long[]"); expect("[[J", "long[][]"); expect("[F", "float[]"); expect("[[F", "float[][]"); expect("[D", "double[]"); expect("[[D", "double[][]"); expect("[B", "byte[]"); expect("[[B", "byte[][]"); expect("[S", "short[]"); expect("[[S", "short[][]"); expect("[C", "char[]"); expect("[[C", "char[][]"); // Array of type void is not allowed expect("[V","Exception"); expect("[[V", "Exception"); expect("[[[V", "Exception"); // When loading an array using the built-in // type id, id must be at end of string expect("[II", "Exception"); expect("[ZZ", "Exception"); expect("[JJ", "Exception"); expect("[FF", "Exception"); expect("[DD", "Exception"); expect("[BB", "Exception"); expect("[SS", "Exception"); expect("[CC", "Exception"); expect("[ZZ", "Exception"); expect("[C;", "Exception"); expect("[C\0;", "Exception"); // [L + Class + ; // Primitive Class name is not valid expect("[Lint;", "Exception"); expect("[Lboolean;", "Exception"); expect("[Llong;","Exception"); expect("[Lfloat;", "Exception"); expect("[Ldouble;", "Exception"); expect("[Lbyte;","Exception"); expect("[Lshort;", "Exception"); expect("[Lchar;","Exception"); expect("[Lvoid;","Exception"); // java.lang.Object[] == [Ljava.lang.Object; // java.lang.Object[][] == [[Ljava.lang.Object; // java.lang.String[] == [Ljava.lang.String; // java.lang.String[][] == [[Ljava.lang.String; expect("[Ljava.lang.Object;", "java.lang.Object[]"); expect("[[Ljava.lang.Object;", "java.lang.Object[][]"); expect("[Ljava.lang.String;", "java.lang.String[]"); expect("[[Ljava.lang.String;", "java.lang.String[][]"); // L + Class must follow 0-N [ characters expect("Ljava.lang.Object;", "Exception"); expect("Ljava.lang.String;", "Exception"); // Misc invalid class nam