Robert Lougher wrote:
Hi,
On 2/4/08, Ian Rogers <[EMAIL PROTECTED]> wrote:
Hi,
xalan performs 1.4 million char array clones per iteration of the normal
size DaCapo benchmark. All of the character array clones are coming from
java.lang.String. The attached patch changes the use of char[].clone
(which maps to java.lang.Object.clone) to a helper method that allocates
the character array and then array copies from the source to the new
array. On the Jikes RVM I get the following performance results from 10
iterations of DaCapo using the large data set size:
current java.lang.String using char[].clone:
run 1: 99157ms
run 2: 98700ms
run 3: 97927ms
patched java.lang.String using the helper routine:
run 1: 97710ms
run 2: 97406ms
run 3: 96762ms
The speedup is between 0.22% and 1.2%. Do people think using the helper
is a sensible change?
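For reference, the quoted range appears consistent with comparing the fastest unpatched run (97927ms) against the slowest and the fastest patched runs. That pairing is an inference from the numbers above, not something stated in the message. A minimal sketch of the arithmetic, where the class and method names are hypothetical:

```java
class SpeedupCheck {
    // Percentage reduction in run time going from beforeMs to afterMs.
    static double speedupPercent(long beforeMs, long afterMs) {
        return 100.0 * (beforeMs - afterMs) / beforeMs;
    }

    public static void main(String[] args) {
        // Run times copied from the results above.
        long[] current = {99157, 98700, 97927};
        long[] patched = {97710, 97406, 96762};

        // Assumed pairing: fastest unpatched run vs. slowest and fastest patched runs.
        long bestCurrent = current[2];
        System.out.println(String.format("low:  %.2f%%", speedupPercent(bestCurrent, patched[0])));
        System.out.println(String.format("high: %.2f%%", speedupPercent(bestCurrent, patched[2])));

        // Alternative pairing: compare run N with run N.
        for (int i = 0; i < current.length; i++) {
            System.out.println(String.format("run %d: %.2f%%", i + 1,
                    speedupPercent(current[i], patched[i])));
        }
    }
}
```

Pairing the runs one by one instead gives roughly 1.2% to 1.5%, so the range depends on which runs are compared.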
I would like to see evidence that this is a win, or at least has no
slowdown on other VMs (i.e. it is VM independent). I think it would
be inappropriate if it was only to address implementation issues in
JikesRVM. For example, why is the helper faster than clone? Surely
all clone() should be doing is an alloc and then an arraycopy?
Rob.
Hi Rob, Twisti, Tromey,
So what happens in the case of the clone is:
1) call into clone
2) determine that this is a clone of an array
3) create array of same length as that which we're cloning (we inline as
far as here in the case of Jikes RVM)
4) call into array copy
5) determine type of array we're copying
6) check for overlaps
7) copy arrays
8) leave array copy and clone
9) check that the resulting array casts back to a char[]
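The clone steps above can be sketched roughly as follows. This assumes the generic array clone is written in Java on top of reflection; the class and method names are hypothetical and this is not the actual Classpath or Jikes RVM code.

```java
import java.lang.reflect.Array;

class GenericCloneSketch {
    // Hypothetical stand-in for the array case of Object.clone.
    static Object cloneArray(Object src) {
        Class<?> type = src.getClass();
        // Step 2: determine that this is a clone of an array.
        if (!type.isArray()) {
            throw new IllegalArgumentException("not an array");
        }
        // Step 3: create an array of the same length and component type.
        // The component type comes from a runtime reflection call.
        int length = Array.getLength(src);
        Object copy = Array.newInstance(type.getComponentType(), length);
        // Steps 4-8: arraycopy works out the array type, checks for
        // overlap and copies the elements.
        System.arraycopy(src, 0, copy, 0, length);
        return copy;
    }

    public static void main(String[] args) {
        char[] value = {'x', 'a', 'l', 'a', 'n'};
        // Step 9: the caller casts the Object result back to char[].
        char[] copy = (char[]) cloneArray(value);
        System.out.println(new String(copy));
    }
}
```

The static type of the result is only Object, which is why the caller needs the final checked cast.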
In the case of the helper method:
1) call into helper method
2) create array of same length as that which we're cloning
3) call into array copy
4) determine type of array we're copying
5) check for overlaps
6) copy arrays (we inline as far as here in the case of Jikes RVM)
7) leave array copy and helper method
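The helper-method steps above amount to something like the sketch below. The class and method names are hypothetical and the attached patch may differ in naming and placement.

```java
class CharArrayCopySketch {
    // Hypothetical helper: copies a char[] without going through clone().
    static char[] copyOf(char[] src) {
        // Step 2: create an array of the same length; the type is
        // statically known to be char[], so no cast is needed later.
        char[] copy = new char[src.length];
        // Steps 3-6: arraycopy from the source into the new array.
        System.arraycopy(src, 0, copy, 0, src.length);
        return copy;
    }

    public static void main(String[] args) {
        char[] value = {'x', 'a', 'l', 'a', 'n'};
        char[] copy = copyOf(value);
        // The copy is independent of the original.
        copy[0] = 'X';
        System.out.println(new String(value) + " " + new String(copy));
    }
}
```

Because both arrays are statically typed as char[], a compiler that inlines arraycopy has a chance to drop the type and overlap checks, which matches the description of inlining as far as step 6.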
The Jikes RVM performs a lot of partial evaluation to determine
that many bounds checks, casts and instanceof tests are unnecessary, and
the result is code that just allocates the array and performs a copy
without checks. In the case of clone our partial evaluation breaks down
because we need the results of runtime reflection calls, or need to know
that these calls are analogous to bytecodes when the arguments are constant.
I plan to do optimizations in this direction, but I don't want to
flatter the optimizations when they probably only affect a small number
of situations; an alternative view is that the code may have been written
in a slightly esoteric manner.
I think the Jikes RVM is performing more optimizations than other
Classpath VMs, so it's likely the performance win will be less marked for
those VMs (if there's any win at all). I think Tromey's point is valid
and I'll try to write a better patch to address it. Once I have posted
the new patch maybe we can return to the question as to whether to make
the change.
Thanks for all the feedback!
Ian