Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-07 Thread Ian Rogers

Tom Tromey wrote:

Ian == Ian Rogers [EMAIL PROTECTED] writes:



Ian Please let me know if you think this patch is suitable for
Ian inclusion.

It looks fine.  I do have one nit, which is that we put spaces around
operators... this problem is pervasive in the patch, but here's one
example:

Ian +newStr[x-offset] = newChar;

That should read

newStr[x - offset] = newChar;

Tom
  

Thanks Tom,

here is a revised patch that if there are no other problems I'd like to 
commit today.


Regards,
Ian
Index: ChangeLog
===
RCS file: /sources/classpath/classpath/ChangeLog,v
retrieving revision 1.9481
diff -u -r1.9481 ChangeLog
--- ChangeLog   6 Feb 2008 19:00:44 -   1.9481
+++ ChangeLog   7 Feb 2008 09:21:01 -
@@ -1,3 +1,8 @@
+2008-02-07  Ian Rogers  [EMAIL PROTECTED]
+
+   * java/lang/String.java
+   Only copy live portion of String. Use array copies in preference to 
clone.
+
 2008-02-05  Ian Rogers  [EMAIL PROTECTED]
 
* gnu/java/lang/reflect/TypeSignature.java
Index: java/lang/String.java
===
RCS file: /sources/classpath/classpath/java/lang/String.java,v
retrieving revision 1.87
diff -u -r1.87 String.java
--- java/lang/String.java   23 Nov 2007 15:04:25 -  1.87
+++ java/lang/String.java   7 Feb 2008 09:21:01 -
@@ -1303,13 +1303,13 @@
 break;
 if (i  0)
   return this;
-char[] newStr = (char[]) value.clone();
-newStr[x] = newChar;
+char[] newStr = toCharArray();
+newStr[x - offset] = newChar;
 while (--i = 0)
   if (value[++x] == oldChar)
-newStr[x] = newChar;
+newStr[x - offset] = newChar;
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
   /**
@@ -1450,23 +1450,25 @@
 
 // Now we perform the conversion. Fortunately, there are no multi-character
 // lowercase expansions in Unicode 3.0.0.
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count - (x - offset));
 do
   {
 char ch = value[x];
 // Hardcoded special case.
 if (ch != '\u0049')
   {
-newStr[x++] = Character.toLowerCase(ch);
+newStr[x - offset] = Character.toLowerCase(ch);
   }
 else
   {
-newStr[x++] = '\u0131';
+newStr[x - offset] = '\u0131';
   }
+x++;
   }
 while (--i = 0);
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
   /**
@@ -1504,16 +1506,18 @@
 
 // Now we perform the conversion. Fortunately, there are no
 // multi-character lowercase expansions in Unicode 3.0.0.
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count - (x - offset));
 do
   {
 char ch = value[x];
 // Hardcoded special case.
-newStr[x++] = Character.toLowerCase(ch);
+newStr[x - offset] = Character.toLowerCase(ch);
+x++;
   }
 while (--i = 0);
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
  }
   }
 
@@ -1557,22 +1561,24 @@
 i = count;
 if (expand == 0)
   {
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count - (x - offset));
 while (--i = 0)
   {
 char ch = value[x];
 // Hardcoded special case.
 if (ch != '\u0069')
   {
-newStr[x++] = Character.toUpperCase(ch);
+newStr[x - offset] = Character.toUpperCase(ch);
   }
 else
   {
-newStr[x++] = '\u0130';
+newStr[x - offset] = '\u0130';
   }
+x++;
   }
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
 // Expansion is necessary.
@@ -1642,14 +1648,16 @@
 i = count;
 if (expand == 0)
   {
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count - (x - offset));
 while (--i = 0)
   {
 char ch = value[x];
-newStr[x++] = Character.toUpperCase(ch);
+newStr[x - offset] = Character.toUpperCase(ch);
+ 

Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-07 Thread Tom Tromey
Ian +  * java/lang/String.java
Ian +  Only copy live portion of String. Use array copies in preference to 
clone.

The ChangeLog entry should mention method names.
See other examples in the file, or the GNU coding standards.

Otherwise -- looks good to me, thanks.

Tom



Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-05 Thread Ian Rogers

Robert Lougher wrote:

Hi,

On 2/4/08, Ian Rogers [EMAIL PROTECTED] wrote:
  

Hi,

xalan performs 1.4 million char array clones per iteration of the normal
size DaCapo benchmark. All of the character array clones are coming from
java.lang.String. The attached patch changes the use of char[].clone
(which maps to java.lang.Object.clone) to a helper method that allocates
the character array and then array copies from the source to the new
array. On the Jikes RVM I get the following performance results from 10
iterations of DaCapo using the large data set size:

current java.lang.String using char[].clone:
run 1: 99157ms
run 2: 98700ms
run 3: 97927ms

patched java.lang.String using the helper routine:
run 1: 97710ms
run 2: 97406ms
run 3: 96762ms

The speed up is between 0.22% and 1.2%. Do people think using the helper
is a sensible change?




I would like to see evidence that this is a win, or at least has no
slowdown on other VMs (i.e. it is VM independent).  I think it would
be inappropriate if it was only to address implementation issues in
JikesRVM.  For example, why is the helper faster than clone?  Surely
all clone() should be doing is an alloc and then an arraycopy?

Rob.
  


Hi Rob, Twisti, Tromey,

so what happens in the case of the clone is:

1) call into clone
2) determine that this is a clone of an array
3) create array of same length as that which we're cloning (we inline as 
far as here in the case of Jikes RVM)

4) call into array copy
5) determine type of array we're copying
6) check for overlaps
7) copy arrays
8) leave array copy and clone
9) check that the resulting array casts back to a char[]

in the case of the helper method:

1) call into helper method
2) create array of same length as that which we're cloning
3) call into array copy
4) determine type of array we're copying
5) check for overlaps
6) copy arrays (we inline as far as here in the case of Jikes RVM)
7) leave array copy and helper method

The Jikes RVM is performing a lot of partial evaluation to determine 
that a lot of bounds checks, casts, instanceof tests are unnecessary and 
the result is code that just allocates the array and performs a copy 
without checks. In the case of clone our partial evaluation breaks down 
because we need the results of runtime reflection calls or to know that 
these calls are analogous to bytecodes when the arguments are constant. 
I plan to do optimizations in this direction, but I don't want to 
flatter the optimizations when they probably only effect a small number 
of situations and alternate view is that code may have been written in a 
slightly esoteric manner.


I think the Jikes RVM is performing more optimizations than other 
Classpath VMs, so its likely the performance win will be less marked for 
those VMs (if there's any win at all). I think Tromey's point is valid 
and I'll try to write a better patch to address it. Once I have posted 
the new patch maybe we can return to the question as to whether to make 
the change.


Thanks for all the feedback!
Ian



Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-05 Thread Ian Rogers

Hello all,

here's a revised patch that removes the use of clone in preference to 
just array copying the live portion of the String. Here are the DaCapo 
xalan figures:


run 1: 97972ms
run 2: 97837ms
run 3: 95290ms

which represents anything from a 0.04% slow down to a 2.8% speed up. 
There will also be a space saving as we're not cloning whole arrays any 
more. Please let me know if you think this patch is suitable for inclusion.


Thanks,
Ian
--- java/lang/String.java   2008-01-23 09:01:02.0 +
+++ java/lang/String.java   2008-02-05 10:07:56.0 +
@@ -1303,13 +1303,13 @@
 break;
 if (i  0)
   return this;
-char[] newStr = (char[]) value.clone();
-newStr[x] = newChar;
+char[] newStr = toCharArray();
+newStr[x-offset] = newChar;
 while (--i = 0)
   if (value[++x] == oldChar)
-newStr[x] = newChar;
+newStr[x-offset] = newChar;
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
   /**
@@ -1450,23 +1450,25 @@
 
 // Now we perform the conversion. Fortunately, there are no multi-character
 // lowercase expansions in Unicode 3.0.0.
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count-(x-offset));
 do
   {
 char ch = value[x];
 // Hardcoded special case.
 if (ch != '\u0049')
   {
-newStr[x++] = Character.toLowerCase(ch);
+newStr[x-offset] = Character.toLowerCase(ch);
   }
 else
   {
-newStr[x++] = '\u0131';
+newStr[x-offset] = '\u0131';
   }
+x++;
   }
 while (--i = 0);
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
   /**
@@ -1504,16 +1506,18 @@
 
 // Now we perform the conversion. Fortunately, there are no
 // multi-character lowercase expansions in Unicode 3.0.0.
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count-(x-offset));
 do
   {
 char ch = value[x];
 // Hardcoded special case.
-newStr[x++] = Character.toLowerCase(ch);
+newStr[x-offset] = Character.toLowerCase(ch);
+x++;
   }
 while (--i = 0);
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
  }
   }
 
@@ -1557,22 +1561,24 @@
 i = count;
 if (expand == 0)
   {
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count-(x-offset));
 while (--i = 0)
   {
 char ch = value[x];
 // Hardcoded special case.
 if (ch != '\u0069')
   {
-newStr[x++] = Character.toUpperCase(ch);
+newStr[x-offset] = Character.toUpperCase(ch);
   }
 else
   {
-newStr[x++] = '\u0130';
+newStr[x-offset] = '\u0130';
   }
+x++;
   }
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
 // Expansion is necessary.
@@ -1642,14 +1648,16 @@
 i = count;
 if (expand == 0)
   {
-char[] newStr = (char[]) value.clone();
+char[] newStr = new char[count];
+VMSystem.arraycopy(value, offset, newStr, 0, count-(x-offset));
 while (--i = 0)
   {
 char ch = value[x];
-newStr[x++] = Character.toUpperCase(ch);
+newStr[x-offset] = Character.toUpperCase(ch);
+x++;
   }
 // Package constructor avoids an array copy.
-return new String(newStr, offset, count, true);
+return new String(newStr, 0, count, true);
   }
 
 // Expansion is necessary.
@@ -1730,9 +1738,6 @@
*/
   public char[] toCharArray()
   {
-if (count == value.length)
-  return (char[]) value.clone();
-
 char[] copy = new char[count];
 VMSystem.arraycopy(value, offset, copy, 0, count);
 return copy;


Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-05 Thread Tom Tromey
 Ian == Ian Rogers [EMAIL PROTECTED] writes:

Ian Please let me know if you think this patch is suitable for
Ian inclusion.

It looks fine.  I do have one nit, which is that we put spaces around
operators... this problem is pervasive in the patch, but here's one
example:

Ian +newStr[x-offset] = newChar;

That should read

newStr[x - offset] = newChar;

Tom



Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-04 Thread Robert Lougher
Hi,

On 2/4/08, Ian Rogers [EMAIL PROTECTED] wrote:
 Hi,

 xalan performs 1.4 million char array clones per iteration of the normal
 size DaCapo benchmark. All of the character array clones are coming from
 java.lang.String. The attached patch changes the use of char[].clone
 (which maps to java.lang.Object.clone) to a helper method that allocates
 the character array and then array copies from the source to the new
 array. On the Jikes RVM I get the following performance results from 10
 iterations of DaCapo using the large data set size:

 current java.lang.String using char[].clone:
 run 1: 99157ms
 run 2: 98700ms
 run 3: 97927ms

 patched java.lang.String using the helper routine:
 run 1: 97710ms
 run 2: 97406ms
 run 3: 96762ms

 The speed up is between 0.22% and 1.2%. Do people think using the helper
 is a sensible change?


I would like to see evidence that this is a win, or at least has no
slowdown on other VMs (i.e. it is VM independent).  I think it would
be inappropriate if it was only to address implementation issues in
JikesRVM.  For example, why is the helper faster than clone?  Surely
all clone() should be doing is an alloc and then an arraycopy?

Rob.

 Thanks,
 Ian

 --- java/lang/String.java   2008-01-23 09:01:02.0 +
 +++ java/lang/String.java   2008-02-04 13:43:02.0 +
 @@ -1284,6 +1284,13 @@
 return new String(newStr, 0, newStr.length, true);
   }

 +  private static char[] cloneCharArray(char[] src)
 +  {
 +char[] copy = new char[src.length];
 +VMSystem.arraycopy(src, 0, copy, 0, src.length);
 +return copy;
 +  }
 +
   /**
* Replaces every instance of a character in this String with a new
* character. If no replacements occur, this is returned.
 @@ -1303,7 +1310,7 @@
 break;
 if (i  0)
   return this;
 -char[] newStr = (char[]) value.clone();
 +char[] newStr = cloneCharArray(value);
 newStr[x] = newChar;
 while (--i = 0)
   if (value[++x] == oldChar)
 @@ -1450,7 +1457,7 @@

 // Now we perform the conversion. Fortunately, there are no 
 multi-character
 // lowercase expansions in Unicode 3.0.0.
 -char[] newStr = (char[]) value.clone();
 +char[] newStr = cloneCharArray(value);
 do
   {
 char ch = value[x];
 @@ -1504,7 +1511,7 @@

 // Now we perform the conversion. Fortunately, there are no
 // multi-character lowercase expansions in Unicode 3.0.0.
 -char[] newStr = (char[]) value.clone();
 +char[] newStr = cloneCharArray(value);
 do
   {
 char ch = value[x];
 @@ -1557,7 +1564,7 @@
 i = count;
 if (expand == 0)
   {
 -char[] newStr = (char[]) value.clone();
 +char[] newStr = cloneCharArray(value);
 while (--i = 0)
   {
 char ch = value[x];
 @@ -1642,7 +1649,7 @@
 i = count;
 if (expand == 0)
   {
 -char[] newStr = (char[]) value.clone();
 +char[] newStr = cloneCharArray(value);
 while (--i = 0)
   {
 char ch = value[x];
 @@ -1731,7 +1738,7 @@
   public char[] toCharArray()
   {
 if (count == value.length)
 -  return (char[]) value.clone();
 +  return cloneCharArray(value);

 char[] copy = new char[count];
 VMSystem.arraycopy(value, offset, copy, 0, count);





Re: [cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-04 Thread Tom Tromey
 Ian == Ian Rogers [EMAIL PROTECTED] writes:

Ian +  private static char[] cloneCharArray(char[] src)
Ian +  {
Ian +char[] copy = new char[src.length];
Ian +VMSystem.arraycopy(src, 0, copy, 0, src.length);
Ian +return copy;  
Ian +  }

I think it would be better in these places to copy the live subset
of the char[] -- if 'this' String is a slice of some other array, we
don't want to copy the entire array, but instead only the part that we
use.  This will use less memory.

Unfortunately this change would mean updating the logic of the
surrounding functions, since they generally use offsets into the
original array.

Tom



[cp-patches] RFC: use helper method to clone char array in java.lang.String

2008-02-04 Thread Ian Rogers

Hi,

xalan performs 1.4 million char array clones per iteration of the normal 
size DaCapo benchmark. All of the character array clones are coming from 
java.lang.String. The attached patch changes the use of char[].clone 
(which maps to java.lang.Object.clone) to a helper method that allocates 
the character array and then array copies from the source to the new 
array. On the Jikes RVM I get the following performance results from 10 
iterations of DaCapo using the large data set size:


current java.lang.String using char[].clone:
run 1: 99157ms
run 2: 98700ms
run 3: 97927ms

patched java.lang.String using the helper routine:
run 1: 97710ms
run 2: 97406ms
run 3: 96762ms

The speed up is between 0.22% and 1.2%. Do people think using the helper 
is a sensible change?


Thanks,
Ian
--- java/lang/String.java   2008-01-23 09:01:02.0 +
+++ java/lang/String.java   2008-02-04 13:43:02.0 +
@@ -1284,6 +1284,13 @@
 return new String(newStr, 0, newStr.length, true);
   }
 
+  private static char[] cloneCharArray(char[] src)
+  {
+char[] copy = new char[src.length];
+VMSystem.arraycopy(src, 0, copy, 0, src.length);
+return copy;  
+  }
+
   /**
* Replaces every instance of a character in this String with a new
* character. If no replacements occur, this is returned.
@@ -1303,7 +1310,7 @@
 break;
 if (i  0)
   return this;
-char[] newStr = (char[]) value.clone();
+char[] newStr = cloneCharArray(value);
 newStr[x] = newChar;
 while (--i = 0)
   if (value[++x] == oldChar)
@@ -1450,7 +1457,7 @@
 
 // Now we perform the conversion. Fortunately, there are no multi-character
 // lowercase expansions in Unicode 3.0.0.
-char[] newStr = (char[]) value.clone();
+char[] newStr = cloneCharArray(value);
 do
   {
 char ch = value[x];
@@ -1504,7 +1511,7 @@
 
 // Now we perform the conversion. Fortunately, there are no
 // multi-character lowercase expansions in Unicode 3.0.0.
-char[] newStr = (char[]) value.clone();
+char[] newStr = cloneCharArray(value);
 do
   {
 char ch = value[x];
@@ -1557,7 +1564,7 @@
 i = count;
 if (expand == 0)
   {
-char[] newStr = (char[]) value.clone();
+char[] newStr = cloneCharArray(value);
 while (--i = 0)
   {
 char ch = value[x];
@@ -1642,7 +1649,7 @@
 i = count;
 if (expand == 0)
   {
-char[] newStr = (char[]) value.clone();
+char[] newStr = cloneCharArray(value);
 while (--i = 0)
   {
 char ch = value[x];
@@ -1731,7 +1738,7 @@
   public char[] toCharArray()
   {
 if (count == value.length)
-  return (char[]) value.clone();
+  return cloneCharArray(value);
 
 char[] copy = new char[count];
 VMSystem.arraycopy(value, offset, copy, 0, count);