Finally some follow up on this!
On Feb 13 2013, at 23:43 , Martin Buchholz wrote:
> I like the optimizations in this change, especially the avoidance of copies.
> Was there some good reason the jdk never before used JavaLangAccess to avoid
> String creation copies?
Not that I am aware of. I did work on String and thereafter the idea occurred
to me. Copying...we hates it.
> I am tempted to micro-optimize this some more, e.g. specialize the hex
> printing code. I might get rid of the digits table and compute ASCII
> characters simply:
>
> ((i < 10) ? '0' : ('a' - 10)) + i
I wanted to play with JMH so used this as an excuse.
It turns out that using a calculation is a performance loser:
Benchmark Thr Cnt Sec Mean Mean
error Var Units
o.o.j.s.g.t.DigitsTableVsCalc.digitsCalc 1 8 1 8463376.310
2516.169 4136965.190 ops/sec
o.o.j.s.g.t.DigitsTableVsCalc.digitsTable 1 8 1 11838194.938
4936.538 15923812.479 ops/sec
The source of the benchmarks:
@OutputTimeUnit(TimeUnit.SECONDS)
@GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
public void digitsTable() throws InterruptedException {
int end = testData.length;
for (int each = 0; each < end; each++) {
int i = testData[each];
output[each] = digits[i];
}
}
@OutputTimeUnit(TimeUnit.SECONDS)
@GenerateMicroBenchmark(BenchmarkType.OpsPerTimeUnit)
public void digitsCalc() throws InterruptedException {
int end = testData.length;
for (int each = 0; each < end; each++) {
int i = testData[each];
int base = i < 10 ? 48 : 87;
output[each] = (char) (base + i);
}
}
Where testData is a 100 element static int array of random digit values (0-35).
> Why not formatUnsignedInt?
Now added and it seems to help.
> Why not make the new format methods public?
They seemed too specialized to me, particularly with the constraints on radix.
>
> Is there a better name than createStringSharedChars? We hope those chars
> will *not* be shared! How about newStringUnsafe(char[] chars)
>
> The spec for
> + int formatUnsignedLong(long val, int shift, char[] buf, int offset, int
> len);
> should make it clearer whose responsibility getting the size right is. It
> looks like the caller has to ensure there will be enough space, so perhaps
> the caller should just pass one offset instead of offset plus len.
Seems reasonable.
> + * @return the lowest character location used
> stray space
Corrected.
>
>
> On Wed, Feb 13, 2013 at 2:45 PM, Mike Duigou <[email protected]> wrote:
> I have updated the patch with some of Ulf's feedback and corrected one
> cut-and-paste error that I made.
>
> The updated webrev is at:
>
> http://cr.openjdk.java.net/~mduigou/JDK-8006627/2/webrev/
>
> Mike
>
> On Feb 12 2013, at 18:25 , Ulf Zibis wrote:
>
> > Am 13.02.2013 02:34, schrieb Mike Duigou:
> >> Thank you for the comments Ulf.
> >>
> >> On Feb 12 2013, at 17:24 , Ulf Zibis wrote:
> >>
> >>> Am 13.02.2013 00:30, schrieb Mike Duigou:
> >>>> Hi Steven;
> >>>>
> >>>> I have updated the patch for Java 8. There's somewhat less code sharing
> >>>> and a bit of refactoring than your last version but the performance
> >>>> should be about the same or a little better.
> >>>>
> >>>> http://cr.openjdk.java.net/~mduigou/JDK-8007398/0/webrev/
> >>> Couldn't you use String(buf, true) for all to(Unsigned)String(...)
> >>> methods ?
> >> Possibly. I didn't go looking too far with looking for additional
> >> improvements.
> >>
> >>> Instead of calculating the mask each time, you could use:
> >>> 309 private static String toUnsignedString(int i, int shift, int
> >>> mask) {
> >> I think that would actually be inefficient. I haven't looked at the JITed
> >> code but the mask calculation is pretty cheap relative to parameter
> >> passing.
> >
> > I believe, JIT will inline the code, so there would be no parameter passing.
> >
> > Additionally the calculation of char count could be faster, if you would
> > have short cuts like:
> > numberOfLeading2Zeros(i)
> > numberOfLeading4Zeros(i)
> > numberOfLeading8Zeros(i)
> > ...
> > So the optimum would be with separate toString methods:
> > String toBase2String(int i);
> > String toBase4String(int i);
> > String toBase8String(int i);
> > ...
> >
> > In any case I would save 2 lines:
> >
> > 311 int mag = Integer.SIZE - Long.numberOfLeadingZeros(i);
> > 312 int charCount = Math.max(((mag + (shift - 1)) / shift), 1);
> > 313 char[] buf = new char[charCount];
> > 316 int mask = (1 << shift) - 1;
> >
> > -Ulf
> >
>
>