Re: large Class.forName() patch

2000-02-08 Thread Godmar Back



Just to make this clear.  I don't plan on doing anything.
I'm just throwing out ideas here.  But if I implemented it, I think
I would use such a flag to special-case regular strings.

Artur's numbers seem useful.  However, as you say, a full investigation 
would be needed to quantify the benefits gained from a utf8-based 
implementation.

> 
> > If somebody
> > implements full String/StringBuffer class using utf8 we could perform
> > some real tests 
> 
> I guess you mean full String using UCS2 (2 bytes per character)?  A UTF8
> implementation is what is there now.
> 

No, the current implementation uses only 2 bytes per character strings
for all java.lang.String objects.

Internally, we store some information as utf8s, but I think this discussion
was about how to best implement java.lang.String.

- Godmar



Re: large Class.forName() patch

2000-02-08 Thread Dan McGuirk


On Tue, 8 Feb 2000, Artur Biesiadowski wrote:
> > Also, I think that charAt() can be sped up for "regular" strings
> > that contain only ASCII chars < 127, because those can be represented in
> > a byte array.  So you have to ask what strings you're application is
> > dealing with.  That may depend on what i18n is used etc.
> 
> Yes, it will go faster as no decoding would have to be done for every
> separate byte, but do you plan using special flag for marking such
> Strings ? 

This is probably not a bad idea.  Strings are immutable, so you only have
to set the flag once when the string is created.  You will save a lot of
time with ASCII strings (by far the most common case) by using this one
extra bit of memory and checking it when you do the charAt.  On the other
hand, you don't want the extra time taken creating strings to outweigh the
benefit.

> If somebody
> implements full String/StringBuffer class using utf8 we could perform
> some real tests 

I guess you mean full String using UCS2 (2 bytes per character)?  A UTF8
implementation is what is there now.



Re: large Class.forName() patch

2000-02-08 Thread Artur Biesiadowski


Godmar Back wrote:
> 

> Even for runtime efficiency, you need to look at profiling data and
> decide whether it's worth it.  Is charAt() really the most frequent
> operation on java.lang.String?  For which application or set of
> applications?
> 

I had no time to implement full String using utf8, so I emulated slower
utf8 behaviour by adding loop for normal charAt, which traverses char
array from start to point of check, compares if it is greater than 127
and if yes assigns it to itself. Of course it is quite stupid opeartion
but I think it should be something in order of decoding utf8 (and same
as there there is only one 'if' and no more operations per character if
it is < 127). I don't suppose that working on bytes nor charater will
change speed here, so it should be good approximate.

I've changed only String.charAt - there will be more functions that need
to be changed possibly slowing it down further.

I've choosen quite String intensive application - I was compiling
SwingSet with kjc. Test was done under JDK 1.2v1 for linux with default
sun jit. With normal String kjc compiled everything in about 11 seconds
(measured by time command, user time - together with startup, dynamic
linking of libraries et all). With slower String it took slightly over
14.5 seconds.

As to how common is String.charAt - I've changed tya to count all
invokes and invokes to String.charAt at runtime. I've run SwingSet demo
and then clicked on every tab, but without meddling with components.
Ratio String.charAt/all calls was 309108/8611457. So about 3,5% of all
java calls in graphics heavy application is String.charAt. It is not
something to ignore - but making it considerably slower you impact all
java applications in visible manner - and remeber that it is with only
single method changed.


> Also, I think that charAt() can be sped up for "regular" strings
> that contain only ASCII chars < 127, because those can be represented in
> a byte array.  So you have to ask what strings you're application is
> dealing with.  That may depend on what i18n is used etc.

Yes, it will go faster as no decoding would have to be done for every
separate byte, but do you plan using special flag for marking such
Strings ? If not, you will have to traverse it one by one anyway. 

> I don't think one should rush to conclusions without data to back it up.
> It may or may not be a benefit, and most likely it will be sometimes.

No I had no data at time I wrote last mail and even currently I've done
only two small tests, but I have a general impression that String
performance is one of most important things for speed of java. I'm sure
I have reads few articles about it - unfortunately no links, so you can
suppose it is just my impression.


At last, I would like to agree with you that there are some ways in
which such implementation could perform better - memory usage, faster
copying and of course integration with internals of vm (which was a
reason for it in first place looking at this thread). If somebody
implements full String/StringBuffer class using utf8 we could perform
some real tests - what I'm trying to say is that speed of String,
especiall charAt is something to really care about - it is one of
hotspots for EVERY (or almost every:) java application.


Artur



Re: large Class.forName() patch

2000-02-07 Thread Godmar Back


> 
> 
> Godmar Back wrote:
> 
> > Thinking about it, maybe it would even be worth thinking about
> > not implementing Strings as char[] arrays internally, but as utf8strings.
> > This would save space for all ASCII strings, it would allows to directly
> > use a literal's utf8 representation.  Since strings are immutable, the
> > internal char[] array is never directly exposed anyway.
> 
> It is done this way in gcj I think. But I do not suppose that it is a
> good thing for java only implementation - main aim for Strings should be
> efficiency. I think that String.charAt is method called often enough to
> cause major performance drop in entire java app with such strings.
> 

We may be talking about different things here, but I don't think gcj
stores strings as utf8strings in the way I propose.
Look at their implementation of charAt(), which is mapped to
"(jchar*)((char*) str->data + str->boffset);" via JvGetStringChars.

Saying that efficiency should be the main aim for Java begs the question
what kind of efficiency: you imply runtime efficiency, but space efficiency
can sometimes be more important.

Even for runtime efficiency, you need to look at profiling data and 
decide whether it's worth it.  Is charAt() really the most frequent
operation on java.lang.String?  For which application or set of 
applications?

Also, I think that charAt() can be sped up for "regular" strings
that contain only ASCII chars < 127, because those can be represented in
a byte array.  So you have to ask what strings you're application is
dealing with.  That may depend on what i18n is used etc.

I don't think one should rush to conclusions without data to back it up.
It may or may not be a benefit, and most likely it will be sometimes.

- Godmar

ps: I think Alexandre is right about class names not being able to
contain a \u0:  while they're represented as a ConstantUtf8Info, the
spec also says 

Class names that appear in class file structures are always represented 
in a fully qualified form (§2.7.9).  These class names are always 
represented as CONSTANT_Utf8_info (§4.4.7) structures, and they are
referenced from those CONSTANT_NameAndType_info (§4.4.6) structures 
that have class names as part of their descriptor (§4.3), as well as 
from all CONSTANT_Class_info (§4.4.1) structures. 

For historical reasons the exact syntax of fully qualified class names 
that appear in class file structures differs from the familiar Java fully 
qualified class name documented in §2.7.9. In the internal form, the
ASCII periods ('.') that normally separate the identifiers (§2.2) 
that make up the fully qualified name are replaced by ASCII forward slashes 
('/'). For example, the normal fully qualified name of class Thread is
java.lang.Thread. In the form used in descriptors in class files, a 
reference to the name of class Thread is implemented using a 
CONSTANT_Utf8_info structure representing the string "java/lang/Thread"

And if you follow the link the identifiers, it'll says that it's a sequence
of isJavaLetterOrDigit chars, which are letters, digit, _ and $.

That said, it doesn't solve the problem that we must throw an exception
if the user looks up "java.lang.String\0" or the like.



Re: large Class.forName() patch

2000-02-07 Thread Artur Biesiadowski


Godmar Back wrote:

> Thinking about it, maybe it would even be worth thinking about
> not implementing Strings as char[] arrays internally, but as utf8strings.
> This would save space for all ASCII strings, it would allows to directly
> use a literal's utf8 representation.  Since strings are immutable, the
> internal char[] array is never directly exposed anyway.

It is done this way in gcj I think. But I do not suppose that it is a
good thing for java only implementation - main aim for Strings should be
efficiency. I think that String.charAt is method called often enough to
cause major performance drop in entire java app with such strings.

Artur



Re: large Class.forName() patch

2000-02-06 Thread Godmar Back


> 
> 
> On Feb  4, 2000, Mo DeJong <[EMAIL PROTECTED]> wrote:
> 
> > The current Kaffe implementation means that a \0 embedded in the
> > class name will screw up code that expects a \0 at the end of a
> > string.
> 
> AFAIK, there can't be \0s embedded in class names.  

Utf8strings can have \u0 characters in them, and the JVM spec says that 
the class name is a CONSTANT_Utf8Info, so it can have a \u0 in it as well.
Why should it not?  Limitations on characters in your identifier make
sense at the java source level, but they're not necessary at the 
bytecode level.

> And, in any case,
> utf8 strings won't contain a \0 char unless they actually contain \u0.
> 

Not so.  On the contrary, utf8strings encode the \u0 with something
different than \0.  The convenience of utf8 in the JVM spec lies in that 
it can encode all unicode character without using a \0.  Hence, they can 
be represented as zero-terminated strings, which is what we do.

The problem comes in when we convert either 16bit unicode java strings
or utf8strings to zero-terminated 8bit C strings.  Then we can't represent
the \u0 character (among others), and problems arise.  If we directly 
converted from unicode java to utf8strings, there would be no problem.
Thinking about it, maybe it would even be worth thinking about 
not implementing Strings as char[] arrays internally, but as utf8strings.
This would save space for all ASCII strings, it would allows to directly
use a literal's utf8 representation.  Since strings are immutable, the
internal char[] array is never directly exposed anyway.

- Godmar



Re: large Class.forName() patch

2000-02-06 Thread Alexandre Oliva


On Feb  4, 2000, Mo DeJong <[EMAIL PROTECTED]> wrote:

> The current Kaffe implementation means that a \0 embedded in the
> class name will screw up code that expects a \0 at the end of a
> string.

AFAIK, there can't be \0s embedded in class names.  And, in any case,
utf8 strings won't contain a \0 char unless they actually contain \u0.

-- 
Alexandre Oliva http://www.ic.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
oliva@{lsd.ic.unicamp.br,guarana.{org,com}} aoliva@{acm,computer}.org
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
** I may forward mail about projects to mailing lists; please use them



Re: large Class.forName() patch

2000-02-04 Thread Godmar Back


> 
> 
> > 
> > 
> > On Thu, 3 Feb 100, Godmar Back wrote:
> > 
> > Ok, that sounds like a plan. The only other problem I noticed is that
> > Kaffe seems to use plain char* types for class names. These class names
> > really should be in unicode strings or at least UTF8 strings. The
> > current Kaffe implementation means that a \0 embedded in the class name
> > will screw up code that expects a \0 at the end of a string. One
> > of the regression test checks for this.
> > 
> 
> We should probably keep all descriptors/identifiers that were derived 
> from uft8consts in the constant pool as utf8consts all the time.
> 

Actually, I was a bit too fast with my reply.  Fortunately,
we already use utf8consts to store the class names internally.

The problem comes in when we convert from a java string to a C string
to a utf8 string.  Then we do seem to lose such strings.  This is exactly
what we do in java_lang_Class_forName.  What we need is a method to
convert Java strings to utf8consts without going through C strings.

- Godmar



Re: large Class.forName() patch

2000-02-04 Thread Godmar Back


> 
> 
> On Thu, 3 Feb 100, Godmar Back wrote:
> 
> Ok, that sounds like a plan. The only other problem I noticed is that
> Kaffe seems to use plain char* types for class names. These class names
> really should be in unicode strings or at least UTF8 strings. The
> current Kaffe implementation means that a \0 embedded in the class name
> will screw up code that expects a \0 at the end of a string. One
> of the regression test checks for this.
> 

We should probably keep all descriptors/identifiers that were derived 
from uft8consts in the constant pool as utf8consts all the time.

- Godmar



Re: large Class.forName() patch

2000-02-03 Thread Mo DeJong


On Thu, 3 Feb 100, Godmar Back wrote:

Ok, that sounds like a plan. The only other problem I noticed is that
Kaffe seems to use plain char* types for class names. These class names
really should be in unicode strings or at least UTF8 strings. The
current Kaffe implementation means that a \0 embedded in the class name
will screw up code that expects a \0 at the end of a string. One
of the regression test checks for this.

>  I'm working on it.  We'll first try renaming the basic types internally
> w/o breaking gcj (Jason sent me a patch for that), and see where that gets 
> us, and then the rest.  I think we should be able to match Sun's output
> exactly.  Unfortunately, egcs broke on me right after I checked it out
> today so waiting on that to be fixed before I can do that.

You can add it to Kaffe's regression tests but I am planning on turning
it into a Mauve test so you should not really need to add it.

Mo Dejong
Red Hat Inc.

> Thanks very much for your regression test.
> 
>   - Godmar
> 
> > 
> > It sounds like there is still some debate as to how the
> > primitive classes should be searched for by name, so could
> > we agree on this trimmed down patch that does not include
> > the part that disables lookups for "int" and such?
> > 
> 



Re: large Class.forName() patch

2000-02-03 Thread Godmar Back



 I'm working on it.  We'll first try renaming the basic types internally
w/o breaking gcj (Jason sent me a patch for that), and see where that gets 
us, and then the rest.  I think we should be able to match Sun's output
exactly.  Unfortunately, egcs broke on me right after I checked it out
today so waiting on that to be fixed before I can do that.
Thanks very much for your regression test.

- Godmar

> 
> It sounds like there is still some debate as to how the
> primitive classes should be searched for by name, so could
> we agree on this trimmed down patch that does not include
> the part that disables lookups for "int" and such?
> 



Re: large Class.forName() patch

2000-02-03 Thread Archie Cobbs


Mo DeJong writes:
> It sounds like there is still some debate as to how the
> primitive classes should be searched for by name, so could
> we agree on this trimmed down patch that does not include
> the part that disables lookups for "int" and such?

Minor nit: it seems clearer to do this version of your first
itypes.c patch..

-Archie

___
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com

Index: itypes.c
===
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/itypes.c,v
retrieving revision 1.17
diff -u -r1.17 itypes.c
--- itypes.c1999/11/29 23:44:10 1.17
+++ itypes.c2000/02/03 23:49:21
@@ -131,23 +131,53 @@
 Hjava_lang_Class*
 classFromSig(const char** strp, Hjava_lang_ClassLoader* loader, errorInfo *einfo)
 {
-   Hjava_lang_Class* cl;
+   Hjava_lang_Class* cl = NULL;
Utf8Const* utf8;
const char* start;
const char* end;
 
-   switch (*(*strp)++) {
-   case 'V': return (voidClass);
-   case 'I': return (intClass);
-   case 'Z': return (booleanClass);
-   case 'S': return (shortClass);
-   case 'B': return (byteClass);
-   case 'C': return (charClass);
-   case 'F': return (floatClass);
-   case 'D': return (doubleClass);
-   case 'J': return (longClass);
-   case '[': return (lookupArray(classFromSig(strp, loader, einfo),
- einfo));
+   /* Check for primitive types */
+   switch (*(*strp)) {
+   case 'V':
+   cl = voidClass;
+   break;
+   case 'I':
+   cl = intClass;
+   break;
+   case 'Z':
+   cl = booleanClass;
+   break;
+   case 'S':
+   cl = shortClass;
+   break;
+   case 'B':
+   cl = byteClass;
+   break;
+   case 'C':
+   cl = charClass;
+   break;
+   case 'F':
+   cl = floatClass;
+   break;
+   case 'D':
+   cl = doubleClass;
+   break;
+   case 'J':
+   cl = longClass;
+   break;
+   }
+   if (cl != NULL) {   /* Any trailing junk is an error */
+   if (*++(*strp) != '\0') {
+   postException(einfo, JAVA_LANG(VerifyError));
+   return (NULL);
+   }
+   return (cl);
+   }
+
+   /* Non-primitive types */
+   switch (*(*strp)++) {
+   case '[':
+   return (lookupArray(classFromSig(strp, loader, einfo), einfo));
case 'L':
start = *strp;
for (end = start; *end != 0 && *end != ';'; end++)



Re: large Class.forName() patch

2000-02-03 Thread Mo DeJong

It sounds like there is still some debate as to how the
primitive classes should be searched for by name, so could
we agree on this trimmed down patch that does not include
the part that disables lookups for "int" and such?

After lookups for "int" and friends are fixed the right
way (assuming there is agreement on what that way is),
will the following test still work? If we can not search
by a class name like "void" then will there be macro
tests like CLASS_ISVOID() CLASS_ISINT() added so we can
do such tests?

if (strcmp(CLASS_CNAME(c), "void") == 0) {
  // Do Something
}

Mo Dejong
Red Hat Inc.


Index: kaffe/kaffevm/classMethod.c
===
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/classMethod.c,v
retrieving revision 1.75
diff -u -r1.75 classMethod.c
--- kaffe/kaffevm/classMethod.c 2000/01/19 11:04:31 1.75
+++ kaffe/kaffevm/classMethod.c 2000/02/03 23:06:12
@@ -1110,12 +1110,13 @@
}
class = loadClass(utf8, NULL, einfo);
utf8ConstRelease(utf8);
+
if (class != 0) {
if (processClass(class, CSTATE_COMPLETE, einfo) == true) {
return (class);
}
}
-   return (0);
+   return (NULL);
 }
 
 /*
@@ -2281,12 +2282,18 @@
/* If we couldn't resolve the element type, there's no way we can
 * construct the array type.
 */
-   if (c == 0) {
-   return (0);
+   if (c == NULL) {
+   return (NULL);
}
 
/* Build signature for array type */
if (CLASS_IS_PRIMITIVE (c)) {
+   /* An array of type void is not allowed */
+   if (strcmp(CLASS_CNAME(c), "void") == 0) {
+   postException(einfo, JAVA_LANG(VerifyError));
+   return (NULL);
+   }
+
arr_class = CLASS_ARRAY_CACHE(c);
if (arr_class) {
return (arr_class);
@@ -2304,12 +2311,12 @@
arr_name = utf8ConstNew(sig, -1);   /* release before returning */
if (!arr_name) {
postOutOfMemory(einfo);
-   return 0;
+   return (NULL);
}
centry = lookupClassEntry(arr_name, c->loader, einfo);
if (centry == 0) {
utf8ConstRelease(arr_name);
-   return (0);
+   return (NULL);
}
 
if (centry->class != 0) {
Index: kaffe/kaffevm/itypes.c
===
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/itypes.c,v
retrieving revision 1.17
diff -u -r1.17 itypes.c
--- kaffe/kaffevm/itypes.c  1999/11/29 23:44:10 1.17
+++ kaffe/kaffevm/itypes.c  2000/02/03 23:06:12
@@ -131,21 +131,49 @@
 Hjava_lang_Class*
 classFromSig(const char** strp, Hjava_lang_ClassLoader* loader, errorInfo *einfo)
 {
-   Hjava_lang_Class* cl;
+   Hjava_lang_Class* cl = NULL;
Utf8Const* utf8;
const char* start;
const char* end;
 
switch (*(*strp)++) {
-   case 'V': return (voidClass);
-   case 'I': return (intClass);
-   case 'Z': return (booleanClass);
-   case 'S': return (shortClass);
-   case 'B': return (byteClass);
-   case 'C': return (charClass);
-   case 'F': return (floatClass);
-   case 'D': return (doubleClass);
-   case 'J': return (longClass);
+   case 'V':
+   if (cl == NULL)
+   cl = voidClass;
+   case 'I':
+   if (cl == NULL)
+   cl = intClass;
+   case 'Z':
+   if (cl == NULL)
+   cl = booleanClass;
+   case 'S':
+   if (cl == NULL)
+   cl = shortClass;
+   case 'B':
+   if (cl == NULL)
+   cl = byteClass;
+   case 'C':
+   if (cl == NULL)
+   cl = charClass;
+   case 'F':
+   if (cl == NULL)
+   cl = floatClass;
+   case 'D':
+   if (cl == NULL)
+   cl = doubleClass;
+   case 'J':
+   if (cl == NULL)
+   cl = longClass;
+
+   if (cl != NULL) {
+   /* If build in type char is not at the end of the string, malformed 
+signature */
+   if (*(*strp) != 0) {
+   postException(einfo, JAVA_LANG(VerifyError));
+   return (NULL);
+   }
+   }
+
+   return (cl);
case '[': return (lookupArray(classFromSig(strp, loader, einfo),
  einfo));
case 'L':
@@ -159,11 +187,17 @@
utf8 = utf8ConstNew(start, end - start);
if (!utf8) {
postOutOfMemory(einfo);
-   return 0;
+   return (NULL);
}
c

Re: large Class.forName() patch

2000-02-01 Thread Mo DeJong

On that note, here is an updated regression test that takes care
of the case where you have an actual Class name "int" or "I" and
it also checks for / in the name of a class passed to forName().

Mo Dejong
Red Hat Inc.

On Tue, 1 Feb 2000, Godmar Back wrote:

> Let me look at Mo's patch more closely and maybe we find something
> that will not break the gcj stuff.  I kind of like to keep the primitive
> classes in the classpool---this I think is nicer for keeping statistics
> on all classes, for instance.
> 
> As for the Class.forName() interface, I think it's better to strive 
> for 100% Sun compatibility here, which means to not provide backdoor 
> ways to lookup primitive classes such as looking up "I" or "int" if 
> Sun doesn't.  For instance, I got totally 
> confused already when I thought that kaffe could look up "I" before I 
> realized that's because I do all my tests in a directory filled with 
> A.class, B.class, ..., I.class, ... Z.class files.
> 
> That said, I find Sun's implementation of Class.forName horrible.
> Why do I have use the L; form in an array but not for simple classes?
> It makes no sense.  Of course, the JDK doc also doesn't document it.
> And why shouldn't you be able to look primitive classes up by name?
> 
>   - Godmar



public class ArrayForName {

public static void testLoadArray() throws Exception {

// Loading by built-in type ID is not allowed
// int  != I
// boolean  != Z
// long != J
// float!= F
// double   != D
// byte != B
// short!= S
// char != C
// void != V

expect("I", "Exception");
expect("Z", "Exception");
expect("J", "Exception");
expect("F", "Exception");
expect("D", "Exception");
expect("B", "Exception");
expect("S", "Exception");
expect("C", "Exception");
expect("V", "Exception");

// Not possible to load by builtin type name

expect("int", "Exception");
expect("boolean", "Exception");
expect("long","Exception");
expect("float",   "Exception");
expect("double",  "Exception");
expect("byte","Exception");
expect("short",   "Exception");
expect("char","Exception");
expect("void","Exception");

// Test loading an array by built-in type id
// int[]== [I
// int[][]  == [[I
// boolean[]== [Z
// boolean[][]  == [[Z
// long[]   == [J
// long[][] == [[J
// float[]  == [F
// float[][]== [[F
// double[] == [D
// double[][]   == [[D
// byte[]   == [B
// byte[][] == [[B
// short[]  == [S
// short[][]== [[S
// char[]   == [C
// char[][] == [[C

expect("[I",  "int[]");
expect("[[I", "int[][]");
expect("[Z",  "boolean[]");
expect("[[Z", "boolean[][]");
expect("[J",  "long[]");
expect("[[J", "long[][]");
expect("[F",  "float[]");
expect("[[F", "float[][]");
expect("[D",  "double[]");
expect("[[D", "double[][]");
expect("[B",  "byte[]");
expect("[[B", "byte[][]");
expect("[S",  "short[]");
expect("[[S", "short[][]");
expect("[C",  "char[]");
expect("[[C", "char[][]");

// Array of type void is not allowed

expect("[V","Exception");
expect("[[V",   "Exception");
expect("[[[V",  "Exception");

// When loading an array using the built-in
// type id, id must be at end of string

expect("[II",   "Exception");
expect("[ZZ",   "Exception");
expect("[JJ",   "Exception");
expect("[FF",   "Exception");
expect("[DD",   "Exception");
expect("[BB",   "Exception");
expect("[SS",   "Exception");
expect("[CC",   "Exception");
expect("[ZZ",   "Exception");
expect("[C;",   "Exception");
expect("[C\0;", "Exception");

// [L + Class + ;
// Primitive Class name is not valid 

expect("[Lint;", "Exception");
expect("[Lboolean;", "Exception");
expect("[Llong;","Exception");
expect("[Lfloat;",   "Exception");
expect("[Ldouble;",  "Exception");
expect("[Lbyte;","Exception");
expect("[Lshort;",   "Exception");
expect("[Lchar;","Exception");
expect("[Lvoid;","Exception");

// java.lang.Object[] == [Ljava.lang.Object;
// java.lang.Object[][]   == [[Ljava.lang.Object;
// java.lang.String[] == [Ljava.lang.String;
// java.lang.String[][]   == [[Ljava.lang.String;

expe

Re: large Class.forName() patch

2000-02-01 Thread Jason Baker


> 
> > 
> > Jason Baker wrote:
> > > ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", "");
> > 
> > That code is actually commented out already.  Godmar will have to
> > chime in on its intent.
> > 
> 
> At first, I thought I just fake the java::lang::Class _Jv_intClass structure
> as a big bss symbol.  Then I realized this doesn't work, and the structure
> is now defined in gcj-class.c
> 
> > > ./kaffe/kaffevm/itypes.c:   initPrimClass(&intClass, "int", 'I', 4);
> > 
> > I suggest using some completely bogus, yet human-readable name in this
> > context.  The primitive types should never be looked up by name, so
> > "[int]" or ";primitive int" or ... hmm, looking in the VM spec (for

I like ";int".  (But not on that line, getName() should still work.)

> > invalid name ideas), I noticed actually it says in S2.2 Identifiers:
> > "An identifier must not be ... a keyword in the Java programming
> > language."  So, maybe a class with name 'int' is bogus  Something
> > that most JVMs don't seem to check, it seems.
> > 
> > > ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = 
>Class.getPrimitiveClass("int");
> > 
> > As Jason pointed out, this is an internal hack for Kaffe.  The
> > argument could just as easily be Class.PRIMITIVE_INT_MAGIC
> > 

I said nothing of the kind.  Since getPrimitiveClass doesn't look the
name up in any hashtable, no PRIMITIVE_INT_MAGIC is needed.  But, it
could be changed to do { return forName0(';' + name); }, then the
check for weird primitive classes could be done in java code with
name.charAt(0) == ';'.


> That said, I find Sun's implementation of Class.forName horrible.
> Why do I have use the L; form in an array but not for simple classes?

I'd like to know why the L notation uses dots instead of slashes.

> It makes no sense.  Of course, the JDK doc also doesn't document it.
> And why shouldn't you be able to look primitive classes up by name?

Because they aren't bytecode keywords!

Jason



Re: large Class.forName() patch

2000-02-01 Thread Godmar Back


> 
> Jason Baker wrote:
> > ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", "");
> 
> That code is actually commented out already.  Godmar will have to
> chime in on its intent.
> 

At first, I thought I just fake the java::lang::Class _Jv_intClass structure
as a big bss symbol.  Then I realized this doesn't work, and the structure
is now defined in gcj-class.c

> > ./kaffe/kaffevm/itypes.c:   initPrimClass(&intClass, "int", 'I', 4);
> 
> I suggest using some completely bogus, yet human-readable name in this
> context.  The primitive types should never be looked up by name, so
> "[int]" or ";primitive int" or ... hmm, looking in the VM spec (for
> invalid name ideas), I noticed actually it says in S2.2 Identifiers:
> "An identifier must not be ... a keyword in the Java programming
> language."  So, maybe a class with name 'int' is bogus  Something
> that most JVMs don't seem to check, it seems.
> 
> > ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = 
>Class.getPrimitiveClass("int");
> 
> As Jason pointed out, this is an internal hack for Kaffe.  The
> argument could just as easily be Class.PRIMITIVE_INT_MAGIC
> 

I think this sounds about right.

gcj still uses the same scheme by giving the primitive classes
internally the names "int" etc.  See DECLARE_PRIM_TYPE in 
libgcj/libjava/prims.cc

So if you changed that, we'd have to map it in gcj/gcj-class.c.
I do remember that I needed the "int" etc. in the class pool for
gcj code to work; I don't remember the details now.  It did fix something.

Note that Class.getPrimitiveClass does not appear to be an exported 
function, but a function that's package-internal to java.lang, presumably
just to implement the Number.TYPE initializations.  I think it may date back
when Kaffe used Sun's classes.zip, which explains the native implementation
of it.

I think Pat is right that the use of "int" in the old kaffe code
in java/lang/*.java and their inclusion in the class pool for gcj 
support are unrelated.

Let me look at Mo's patch more closely and maybe we find something
that will not break the gcj stuff.  I kind of like to keep the primitive
classes in the classpool---this I think is nicer for keeping statistics
on all classes, for instance.

As for the Class.forName() interface, I think it's better to strive 
for 100% Sun compatibility here, which means to not provide backdoor 
ways to lookup primitive classes such as looking up "I" or "int" if 
Sun doesn't.  For instance, I got totally 
confused already when I thought that kaffe could look up "I" before I 
realized that's because I do all my tests in a directory filled with 
A.class, B.class, ..., I.class, ... Z.class files.

That said, I find Sun's implementation of Class.forName horrible.
Why do I have use the L; form in an array but not for simple classes?
It makes no sense.  Of course, the JDK doc also doesn't document it.
And why shouldn't you be able to look primitive classes up by name?

- Godmar



Re: large Class.forName() patch

2000-01-31 Thread Patrick Tullmann


Jason Baker wrote:
> ./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", "");

That code is actually commented out already.  Godmar will have to
chime in on its intent.

> ./kaffe/kaffevm/itypes.c:   initPrimClass(&intClass, "int", 'I', 4);

I suggest using some completely bogus, yet human-readable name in this
context.  The primitive types should never be looked up by name, so
"[int]" or ";primitive int" or ... hmm, looking in the VM spec (for
invalid name ideas), I noticed actually it says in S2.2 Identifiers:
"An identifier must not be ... a keyword in the Java programming
language."  So, maybe a class with name 'int' is bogus  Something
that most JVMs don't seem to check, it seems.

> ./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = 
>Class.getPrimitiveClass("int");

As Jason pointed out, this is an internal hack for Kaffe.  The
argument could just as easily be Class.PRIMITIVE_INT_MAGIC

-Pat

- -  ---  ---  --   --  - -   -
Pat Tullmann   [EMAIL PROTECTED]
  That which does not kill you just didn't try hard enough.



Re: large Class.forName() patch

2000-01-31 Thread Jason Baker


Archie Cobbs <[EMAIL PROTECTED]> writes:

> Jason Baker writes:
> > >   #define INTEGER_PRIMITIVE_CLASS "I"
> > > 
> > > and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work?
> > 
> > No, its even worse, since a class in the anonymous package is alot
> > more likely to be called "I" than "int".  We can use "I" only where a
> > class would be L;.  
> 
> Good point.. what about
> 
>   #define INTEGER_PRIMITIVE_CLASS ";I"
>   #define BOOLEAN_PRIMITIVE_CLASS ";Z"
> 

That works, but why are we looking up primitve types by name anyway?

nephi(14) find . -type f -print | xargs grep '"int"'
./developers/fixup.c:addSymbol("_Jv_intClass", CLASS, "int", "", "");
./kaffe/kaffevm/itypes.c:* Note that a user-defined "class int" is impossible 
cause "int"
./kaffe/kaffevm/itypes.c:   initPrimClass(&intClass, "int", 'I', 4);
./libraries/javalib/java/lang/Integer.java: final public static Class TYPE = 
Class.getPrimitiveClass("int");

getPrimitveClass does not look up types by name, so the only problem
is in developers/fixup.c.  Why can't it just translate _Jv_intClass to
intClass?

Jason



Re: large Class.forName() patch

2000-01-31 Thread Archie Cobbs


Jason Baker writes:
> >   #define INTEGER_PRIMITIVE_CLASS   "I"
> > 
> > and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work?
> 
> No, its even worse, since a class in the anonymous package is alot
> more likely to be called "I" than "int".  We can use "I" only where a
> class would be L;.  

Good point.. what about

  #define INTEGER_PRIMITIVE_CLASS   ";I"
  #define BOOLEAN_PRIMITIVE_CLASS   ";Z"

etc.. ?

___
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com



Re: large Class.forName() patch

2000-01-31 Thread Jason Baker


Archie Cobbs <[EMAIL PROTECTED]> writes:

> 
>   #define INTEGER_PRIMITIVE_CLASS "I"
> 
> and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work?

No, its even worse, since a class in the anonymous package is alot
more likely to be called "I" than "int".  We can use "I" only where a
class would be L;.  

Jason



Re: large Class.forName() patch

2000-01-31 Thread Jason Baker


Mo DeJong <[EMAIL PROTECTED]> writes:

> Ok, if you have a class named "int" then the JVM should load
> that when you call forName("int"). I am not sure what Kaffe
> would do in this case because it already uses the name "int"
> for the primitive class "I". I think the best way to make
> the regression test check for this is to ensure that
> a loaded class (like "int") is not a primitive type.

ClassMethod.c is not to blame, the problem lies in itypes.c.

What happens when GCJ code is loaded after the class int?  You can
solve the forName problem by backing out this change and finding
another way to expose primitive types to GCJ code.  Since GCJ support
doesn't actually work, backing out the change along is another option.

Index: itypes.c
===
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/itypes.c,v
retrieving revision 1.16
retrieving revision 1.17
diff -c -r1.16 -r1.17
*** itypes.c1999/11/04 20:29:12 1.16
--- itypes.c1999/11/29 23:44:10 1.17
***
*** 36,42 
--- 36,45 
  void
  initPrimClass(Hjava_lang_Class** class, char* name, char sig, int len)
  {
+   errorInfo info;
+   classEntry* centry;
Hjava_lang_Class* clazz = newClass();
+ 
if (clazz == 0) {
goto bad;
}
***
*** 56,61 
--- 59,78 
TYPE_PRIM_SIZE(clazz) = len;
/* prevent any attempt to process those in processClass */
clazz->state = CSTATE_COMPLETE;
+ 
+   /* Add primitive types to the class pool as well 
+* This allows us to find them by name---for instance from gcj code.
+*
+* Note that a user-defined "class int" is impossible cause "int"
+* is a keyword.
+*/
+   centry = lookupClassEntry(clazz->name, 0, &info);
+   if (centry == 0) {
+   goto bad;
+   }
+   clazz->centry = centry;
+   centry->class = clazz;
+ 
return;
  bad:
fprintf(stderr, "not enough memory to run kaffe\n");


Jason



Re: large Class.forName() patch

2000-01-31 Thread Archie Cobbs


Mo DeJong writes:
> Ok, if you have a class named "int" then the JVM should load
> that when you call forName("int"). I am not sure what Kaffe
> would do in this case because it already uses the name "int"
> for the primitive class "I". I think the best way to make
> the regression test check for this is to ensure that
> a loaded class (like "int") is not a primitive type.

So we should fix kaffe internally. It shouldn't use "int" as
an internal name for the integer primitive class, because that
clashes with int.class; instead, let's use "I" like we're supposed
to, and maybe define a macro:

  #define INTEGER_PRIMITIVE_CLASS   "I"

and use INTEGER_PRIMITIVE_CLASS instead of "int".. would this work?

-Archie

___
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com



Re: large Class.forName() patch

2000-01-31 Thread Mo DeJong


Ok, if you have a class named "int" then the JVM should load
that when you call forName("int"). I am not sure what Kaffe
would do in this case because it already uses the name "int"
for the primitive class "I". I think the best way to make
the regression test check for this is to ensure that
a loaded class (like "int") is not a primitive type.

Mo Dejong
Red Hat Inc.

On Mon, 31 Jan 2000, Patrick Tullmann wrote:

> I ran ArrayForName on JDK 1.2.2 on Solaris.  Got this output:
> 
> for clsName "int" expected "Exception" but got "int"
> for clsName "[Lint;" expected "Exception" but got "int[]"
> 
> Note that I had a (legit) class called 'int.class' in my current
> directory when I ran the test.  Loading 'int' does *not* load the
> Integer class or the primitive 'int' class, but it *will* load some
> arbitrary class called 'int'.  So the explicit denial of 'int' classes
> is wrong.
> 
> -Pat




Re: large Class.forName() patch

2000-01-31 Thread Mo DeJong


On Sun, 30 Jan 2000, Archie Cobbs wrote:

> 
> Mo DeJong writes:
> > I have been working on getting Class.forName() working more like
> > the Sun JDK and I have a patch that I would like to get some comments
> 
> I don't understand this comment:
> 
> +   /* This is kind of strange, but Sun's implementation
> +  does not allow lookups like Class.forName("int"),
> +  so if this is a primitive type, we just pretend
> +  it could not be found. It would be handy to load
> +  "int" or "char" by name, but oh well */

Well, then why does Kaffe allow lookups by name? I thought it might
be a handy feature but it is not something that is allowed in the
Sun JDK. My patch removed the ability to do lookups like
Class.forName("int") and Class.forName("void"), so that Kaffe
would act more like the Sun JDK. I think your "Quick Take" on
it was the same as mine, I just want to make sure that everyone
agrees that such a lookup should not be allowed and that the
patch I sent is the correct way to disallow them. If you like we
can change the "This is kind of strange, but" text to
"Testing shows that Sun's implementation ...".

Mo DeJong
Red Hat Inc.

> Why would a JVM be expected to recognize a Java language keyword?
> As far as the JVM is concerned, Java is just one of many higher
> level languages that are capable of being compiled to bytecode
> (or am I misreading something, this was just a quick take).
> 
> -Archie
> 
> ___
> Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com
> 



Re: large Class.forName() patch

2000-01-31 Thread Artur Biesiadowski


Archie Cobbs wrote:

> I don't understand this comment:
> 
> +   /* This is kind of strange, but Sun's implementation
> +  does not allow lookups like Class.forName("int"),
> +  so if this is a primitive type, we just pretend
> +  it could not be found. It would be handy to load
> +  "int" or "char" by name, but oh well */
> 
> Why would a JVM be expected to recognize a Java language keyword?
> As far as the JVM is concerned, Java is just one of many higher
> level languages that are capable of being compiled to bytecode
> (or am I misreading something, this was just a quick take).


But it could understand I, J etc - especially for [I,[[I and friends
classes. Certainly 'int' is not right thing - but single letter codes
are JVM thing.

Is there any place where it is explicitly told if Class.forName supports
arrays of primitives ?

Artur



Re: large Class.forName() patch

2000-01-30 Thread Archie Cobbs


Mo DeJong writes:
> I have been working on getting Class.forName() working more like
> the Sun JDK and I have a patch that I would like to get some comments

I don't understand this comment:

+   /* This is kind of strange, but Sun's implementation
+  does not allow lookups like Class.forName("int"),
+  so if this is a primitive type, we just pretend
+  it could not be found. It would be handy to load
+  "int" or "char" by name, but oh well */

Why would a JVM be expected to recognize a Java language keyword?
As far as the JVM is concerned, Java is just one of many higher
level languages that are capable of being compiled to bytecode
(or am I misreading something, this was just a quick take).

-Archie

___
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com



Re: large Class.forName() patch

2000-01-29 Thread Jason Baker


> Hi all.
> 
> I have been working on getting Class.forName() working more like
> the Sun JDK and I have a patch that I would like to get some comments
> on. I have attached a file called ArrayForName.java which tests
> the Class.forName() method and compares the result to an expected
> result. First, try running this test case on the current CVS version
> of Kaffe to see how many failures you get. Then apply the patch
> called forname.patch and check out how many problems it fixes.
> 
> I feel good about almost all of the changes in the patch, except
> for the first on (the one in file classMethod.c around line 1035).
> 
> I am not sure if this would break JNI code, or if was just the wrong
> place to make the change. Could someone who knows more about the
> internals check it out and suggest a better change if there is one?
> 

Wow, I had no idea that forName accepted some of the things you found.
I hava noticed that it accepts [Ljava/lang/Object; as well as
[Ljava.lang.Object; It doesn't look like you changed this, which is
probably just as well.  As for primitive types, I suspect that keeping
them in the classpool is a bug in the first place.  Why shouldn't a
JVM accept a class who's name is a Java Language keyword?

Jason
 



large Class.forName() patch

2000-01-28 Thread Mo DeJong

Hi all.

I have been working on getting Class.forName() working more like
the Sun JDK and I have a patch that I would like to get some comments
on. I have attached a file called ArrayForName.java which tests
the Class.forName() method and compares the result to an expected
result. First, try running this test case on the current CVS version
of Kaffe to see how many failures you get. Then apply the patch
called forname.patch and check out how many problems it fixes.

I feel good about almost all of the changes in the patch, except
for the first on (the one in file classMethod.c around line 1035).

I am not sure if this would break JNI code, or if was just the wrong
place to make the change. Could someone who knows more about the
internals check it out and suggest a better change if there is one?

thanks
Mo Dejong
Red Hat Inc.


public class ArrayForName {

public static void testLoadArray() throws Exception {

// Loading by built-in type ID is not allowed
// int  != I
// boolean  != Z
// long != J
// float!= F
// double   != D
// byte != B
// short!= S
// char != C
// void != V

expect("I", "Exception");
expect("Z", "Exception");
expect("J", "Exception");
expect("F", "Exception");
expect("D", "Exception");
expect("B", "Exception");
expect("S", "Exception");
expect("C", "Exception");
expect("V", "Exception");

// Not possible to load by builtin type name

expect("int", "Exception");
expect("boolean", "Exception");
expect("long","Exception");
expect("float",   "Exception");
expect("double",  "Exception");
expect("byte","Exception");
expect("short",   "Exception");
expect("char","Exception");
expect("void","Exception");

// Test loading an array by built-in type id
// int[]== [I
// int[][]  == [[I
// boolean[]== [Z
// boolean[][]  == [[Z
// long[]   == [J
// long[][] == [[J
// float[]  == [F
// float[][]== [[F
// double[] == [D
// double[][]   == [[D
// byte[]   == [B
// byte[][] == [[B
// short[]  == [S
// short[][]== [[S
// char[]   == [C
// char[][] == [[C

expect("[I",  "int[]");
expect("[[I", "int[][]");
expect("[Z",  "boolean[]");
expect("[[Z", "boolean[][]");
expect("[J",  "long[]");
expect("[[J", "long[][]");
expect("[F",  "float[]");
expect("[[F", "float[][]");
expect("[D",  "double[]");
expect("[[D", "double[][]");
expect("[B",  "byte[]");
expect("[[B", "byte[][]");
expect("[S",  "short[]");
expect("[[S", "short[][]");
expect("[C",  "char[]");
expect("[[C", "char[][]");

// Array of type void is not allowed

expect("[V","Exception");
expect("[[V",   "Exception");
expect("[[[V",  "Exception");

// When loading an array using the built-in
// type id, id must be at end of string

expect("[II",   "Exception");
expect("[ZZ",   "Exception");
expect("[JJ",   "Exception");
expect("[FF",   "Exception");
expect("[DD",   "Exception");
expect("[BB",   "Exception");
expect("[SS",   "Exception");
expect("[CC",   "Exception");
expect("[ZZ",   "Exception");
expect("[C;",   "Exception");
expect("[C\0;", "Exception");

// [L + Class + ;
// Primitive Class name is not valid 

expect("[Lint;", "Exception");
expect("[Lboolean;", "Exception");
expect("[Llong;","Exception");
expect("[Lfloat;",   "Exception");
expect("[Ldouble;",  "Exception");
expect("[Lbyte;","Exception");
expect("[Lshort;",   "Exception");
expect("[Lchar;","Exception");
expect("[Lvoid;","Exception");

// java.lang.Object[] == [Ljava.lang.Object;
// java.lang.Object[][]   == [[Ljava.lang.Object;
// java.lang.String[] == [Ljava.lang.String;
// java.lang.String[][]   == [[Ljava.lang.String;

expect("[Ljava.lang.Object;",  "java.lang.Object[]");
expect("[[Ljava.lang.Object;", "java.lang.Object[][]");
expect("[Ljava.lang.String;",  "java.lang.String[]");
expect("[[Ljava.lang.String;", "java.lang.String[][]");

// L + Class must follow 0-N [ characters

expect("Ljava.lang.Object;", "Exception");
expect("Ljava.lang.String;", "Exception");

// Misc invalid class nam