Hello,
currently jdk.internal.misc.Unsafe declares the method
allocateUninitializedArray(Class, int), which returns an uninitialized array
of the given component type and length.
Allocating an uninitialized array is faster than allocating a zeroed one,
especially for larger arrays, so we could use it in cases
where we are sure that the created array will be completely overwritten or is
guarded by a count field (e.g. in AbstractStringBuilder).
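
To make the "guarded by a count field" case concrete, here is a minimal sketch of such a buffer. The class CountGuardedBuffer and its methods are hypothetical and are not JDK code, and a plain new byte[capacity] stands in for the uninitialized allocation because ordinary code cannot call jdk.internal.misc.Unsafe. The point is only that every read is bounded by count, so the bytes beyond it are never observed:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not JDK code: reads never go past `count`,
// so whatever the backing array holds beyond it is never observed.
final class CountGuardedBuffer {
    private byte[] value;
    private int count; // number of bytes actually written

    CountGuardedBuffer(int capacity) {
        // Stand-in for an uninitialized allocation; ordinary code cannot
        // call jdk.internal.misc.Unsafe, so this sketch uses a zeroed array.
        value = new byte[capacity];
    }

    CountGuardedBuffer append(byte b) {
        if (count == value.length) {
            // Grow the backing array; only the written prefix matters to readers.
            value = Arrays.copyOf(value, Math.max(16, value.length * 2));
        }
        value[count++] = b;
        return this;
    }

    int length() {
        return count;
    }

    byte byteAt(int index) {
        // The guard: indices at or beyond count are rejected.
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + count);
        }
        return value[index];
    }

    @Override
    public String toString() {
        // Only the first `count` bytes take part in the result.
        return new String(value, 0, count, StandardCharsets.ISO_8859_1);
    }
}
```

Any code path that exposes the backing array past count (for example, handing out the raw array) would break this guarantee, so each call site has to be checked individually.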
I've exposed jdk.internal.misc.Unsafe.allocateUninitializedArray(Class, int)
via a delegating method in sun.misc.Unsafe to measure the creation of byte[]
with benchmark [1]
and got these results:
Benchmark    (length)  Mode  Cnt    Score     Error  Units
constructor         0  avgt   50    7.639 ±  0.629  ns/op
constructor        10  avgt   50    9.448 ±  0.725  ns/op
constructor       100  avgt   50   21.158 ±  1.865  ns/op
constructor      1000  avgt   50  146.158 ±  9.836  ns/op
constructor     10000  avgt   50  916.321 ± 50.618  ns/op
unsafe              0  avgt   50    8.057 ±  0.599  ns/op
unsafe             10  avgt   50    8.308 ±  0.907  ns/op
unsafe            100  avgt   50   12.232 ±  1.813  ns/op
unsafe           1000  avgt   50   37.679 ±  1.382  ns/op
unsafe          10000  avgt   50   78.896 ±  6.758  ns/op
I therefore propose to add the following methods to StringConcatHelper:

    @ForceInline
    static byte[] newArray(int length) {
        return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length);
    }

    @ForceInline
    static char[] newCharArray(int length) {
        return (char[]) UNSAFE.allocateUninitializedArray(char.class, length);
    }

alongside the existing StringConcatHelper.newArray(long indexCoder), and to
use them in String-related operations
instead of conventional array creation with the new keyword.
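
For readers unfamiliar with the existing newArray(long indexCoder), the sketch below mirrors the arithmetic it uses in the attached patch: the coder sits in the upper 32 bits, the index in the lower 32 bits, and the byte length is index << coder. The class IndexCoderSketch and the pack helper are hypothetical and only illustrate the layout implied by that decoding; the constants match the LATIN1 and UTF16 coder values of java.lang.String:

```java
// Hypothetical standalone sketch; only byteLength mirrors the patch.
final class IndexCoderSketch {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    // Assumed packing, implied by the decoding below: coder in the
    // upper 32 bits, index in the lower 32 bits.
    static long pack(int index, byte coder) {
        return ((long) coder << 32) | (index & 0xFFFFFFFFL);
    }

    // Same steps as newArray(long indexCoder) in the patch, minus the allocation.
    static int byteLength(long indexCoder) {
        byte coder = (byte) (indexCoder >> 32);
        int index = (int) indexCoder;
        return index << coder;
    }

    public static void main(String[] args) {
        System.out.println(byteLength(pack(10, LATIN1))); // prints 10
        System.out.println(byteLength(pack(10, UTF16)));  // prints 20
    }
}
```

With this split, the new newArray(int length) overload takes a length that is already in bytes, while the indexCoder overload keeps doing the char-to-byte scaling.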
I've used benchmark [2] to measure the impact on the affected String methods
and found a good improvement for most of them:
                                      before            after
Benchmark                   (length)    Score    Error    Score    Error  Units
newStringBuilderWithLength         8    8.288 ±  0.411    5.656 ±  0.019  ns/op
newStringBuilderWithLength        64   12.954 ±  0.687    7.588 ±  0.009  ns/op
newStringBuilderWithLength       128   20.603 ±  0.412   10.446 ±  0.005  ns/op
newStringBuilderWithLength      1024  119.935 ±  2.383   35.452 ±  0.029  ns/op
newStringBuilderWithString         8   19.721 ±  0.375   14.642 ±  0.052  ns/op
newStringBuilderWithString        64   34.006 ±  1.523   15.479 ±  0.031  ns/op
newStringBuilderWithString       128   36.697 ±  0.972   17.052 ±  0.133  ns/op
newStringBuilderWithString      1024  140.486 ±  6.396   85.156 ±  0.175  ns/op
repeatOneByteString                8   11.340 ±  0.197    9.736 ±  0.051  ns/op
repeatOneByteString               64   20.859 ±  0.257   15.073 ±  0.024  ns/op
repeatOneByteString              128   36.311 ±  1.162   22.808 ±  0.198  ns/op
repeatOneByteString             1024  149.243 ±  3.083   82.839 ±  0.193  ns/op
repeatOneChar                      8   28.033 ±  0.615   20.377 ±  0.137  ns/op
repeatOneChar                     64   56.399 ±  1.094   36.230 ±  0.051  ns/op
repeatOneChar                    128   68.423 ±  5.647   44.157 ±  0.239  ns/op
repeatOneChar                   1024  230.115 ±  0.312  179.126 ±  0.437  ns/op
replace                            8   14.684 ±  0.088   14.434 ±  0.057  ns/op
replace                           64   56.811 ±  0.612   56.420 ±  0.050  ns/op
replace                          128  112.694 ±  0.404  109.799 ±  1.202  ns/op
replace                         1024  837.939 ±  0.855  818.802 ±  0.408  ns/op
replaceUtf                         8   17.802 ±  0.074   15.744 ±  0.094  ns/op
replaceUtf                        64   45.754 ±  0.139   39.228 ±  0.864  ns/op
replaceUtf                       128   67.170 ±  0.353   50.497 ±  1.218  ns/op
replaceUtf                      1024  415.767 ±  6.829  297.831 ± 22.510  ns/op
toCharArray                        8    6.164 ±  0.033    6.128 ±  0.064  ns/op
toCharArray                       64   10.960 ±  0.032   13.566 ±  0.802  ns/op
toCharArray                      128   20.373 ±  0.013   20.823 ±  0.376  ns/op
toCharArray                     1024  165.923 ±  0.098  164.362 ±  0.065  ns/op
toCharArrayUTF                     8    8.009 ±  0.067    7.778 ±  0.026  ns/op
toCharArrayUTF                    64   11.112 ±  0.014   10.880 ±  0.010  ns/op
toCharArrayUTF                   128   20.390 ±  0.014   20.103 ±  0.017  ns/op
toCharArrayUTF                  1024  166.233 ±  0.076  163.827 ±  0.099  ns/op
So my question is: could the changes in the attached patch be included in the JDK?
With best regards,
Sergey Tsypanov
[1] Benchmark for array allocation
import java.lang.reflect.Field;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import sun.misc.Unsafe;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class ArrayAllocationBenchmark {
    private static Unsafe U;

    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            U = (Unsafe) f.get(null);
        } catch (Exception e) {
            // Fail fast instead of silently leaving U null
            throw new RuntimeException(e);
        }
    }

    @Benchmark
    public byte[] constructor(Data data) {
        return new byte[data.length];
    }

    @Benchmark
    public byte[] unsafe(Data data) {
        // Requires the locally added delegating method in sun.misc.Unsafe
        return (byte[]) U.allocateUninitializedArray(byte.class, data.length);
    }

    @State(Scope.Thread)
    public static class Data {
        @Param({"0", "10", "100", "1000", "10000"})
        private int length;
    }
}
[2] Benchmark for String method measurements
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class MiscStringBenchmark {

    @Benchmark
    public char[] toCharArrayLatin1(Data data) {
        return data.string.toCharArray();
    }

    @Benchmark
    public char[] toCharArrayUTF(Data data) {
        return data.utfString.toCharArray();
    }

    @Benchmark
    public String repeatOneByteString(Data data) {
        return data.oneCharString.repeat(data.length);
    }

    @Benchmark
    public String repeatOneChar(Data data) {
        return data.oneUtfCharString.repeat(data.length);
    }

    @Benchmark
    public String replace(Data data) {
        return data.stringToReplace.replace('z', 'b');
    }

    @Benchmark
    public String replaceUtf(Data data) {
        return data.utfStringToReplace.replace('я', 'ю');
    }

    @Benchmark
    public StringBuilder newStringBuilderWithLength(Data data) {
        return new StringBuilder(data.length);
    }

    @Benchmark
    public StringBuilder newStringBuilderWithString(Data data) {
        return new StringBuilder(data.string);
    }

    @State(Scope.Thread)
    public static class Data {
        @Param({"8", "64", "128", "1024"})
        private int length;

        private String string;
        public String utfString;
        private final String oneCharString = "a";
        private final String oneUtfCharString = "ё";
        private String stringToReplace;
        private String utfStringToReplace;

        @Setup
        public void setup() {
            string = oneCharString.repeat(length);
            utfString = oneUtfCharString.repeat(length);
            // The trailing character guarantees that replace() finds a match
            stringToReplace = string + 'z';
            utfStringToReplace = utfString + 'я';
        }
    }
}

diff --git a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
--- a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
+++ b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
@@ -85,7 +85,7 @@
*/
AbstractStringBuilder(int capacity) {
if (COMPACT_STRINGS) {
- value = new byte[capacity];
+ value = StringConcatHelper.newArray(capacity);
coder = LATIN1;
} else {
value = StringUTF16.newBytesFor(capacity);
@@ -108,7 +108,7 @@
final byte initCoder = str.coder();
coder = initCoder;
value = (initCoder == LATIN1)
- ? new byte[capacity] : StringUTF16.newBytesFor(capacity);
+ ? StringConcatHelper.newArray(capacity) : StringUTF16.newBytesFor(capacity);
append(str);
}
@@ -143,7 +143,7 @@
coder = initCoder;
value = (initCoder == LATIN1)
- ? new byte[capacity] : StringUTF16.newBytesFor(capacity);
+ ? StringConcatHelper.newArray(capacity) : StringUTF16.newBytesFor(capacity);
append(seq);
}
diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java
--- a/src/java.base/share/classes/java/lang/String.java
+++ b/src/java.base/share/classes/java/lang/String.java
@@ -3578,12 +3578,12 @@
throw new OutOfMemoryError("Required length exceeds implementation limit");
}
if (len == 1) {
- final byte[] single = new byte[count];
+ final byte[] single = StringConcatHelper.newArray(count);
Arrays.fill(single, value[0]);
return new String(single, coder);
}
final int limit = len * count;
- final byte[] multiple = new byte[limit];
+ final byte[] multiple = StringConcatHelper.newArray(limit);
System.arraycopy(value, 0, multiple, 0, len);
int copied = len;
for (; copied < limit - copied; copied <<= 1) {
diff --git a/src/java.base/share/classes/java/lang/StringConcatHelper.java b/src/java.base/share/classes/java/lang/StringConcatHelper.java
--- a/src/java.base/share/classes/java/lang/StringConcatHelper.java
+++ b/src/java.base/share/classes/java/lang/StringConcatHelper.java
@@ -491,7 +491,28 @@
static byte[] newArray(long indexCoder) {
byte coder = (byte)(indexCoder >> 32);
int index = (int)indexCoder;
- return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, index << coder);
+ int length = index << coder;
+ return newArray(length);
+ }
+
+ /**
+ * Allocates an uninitialized byte array
+ * @param length
+ * @return the newly allocated byte array
+ */
+ @ForceInline
+ static byte[] newArray(int length) {
+ return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length);
+ }
+
+ /**
+ * Allocates an uninitialized char array
+ * @param length
+ * @return the newly allocated char array
+ */
+ @ForceInline
+ static char[] newCharArray(int length) {
+ return (char[]) UNSAFE.allocateUninitializedArray(char.class, length);
}
/**
diff --git a/src/java.base/share/classes/java/lang/StringLatin1.java b/src/java.base/share/classes/java/lang/StringLatin1.java
--- a/src/java.base/share/classes/java/lang/StringLatin1.java
+++ b/src/java.base/share/classes/java/lang/StringLatin1.java
@@ -27,11 +27,9 @@
import java.util.Arrays;
import java.util.Locale;
-import java.util.Objects;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.IntConsumer;
-import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
import jdk.internal.HotSpotIntrinsicCandidate;
@@ -71,7 +69,7 @@
}
public static char[] toChars(byte[] value) {
- char[] dst = new char[value.length];
+ char[] dst = StringConcatHelper.newCharArray(value.length);
inflate(value, 0, dst, 0, value.length);
return dst;
}
@@ -742,7 +740,7 @@
}
public static byte[] toBytes(int[] val, int off, int len) {
- byte[] ret = new byte[len];
+ byte[] ret = StringConcatHelper.newArray(len);
for (int i = 0; i < len; i++) {
int cp = val[off++];
if (!canEncode(cp)) {
diff --git a/src/java.base/share/classes/java/lang/StringUTF16.java b/src/java.base/share/classes/java/lang/StringUTF16.java
--- a/src/java.base/share/classes/java/lang/StringUTF16.java
+++ b/src/java.base/share/classes/java/lang/StringUTF16.java
@@ -34,8 +34,6 @@
import java.util.stream.StreamSupport;
import jdk.internal.HotSpotIntrinsicCandidate;
import jdk.internal.util.ArraysSupport;
-import jdk.internal.vm.annotation.ForceInline;
-import jdk.internal.vm.annotation.DontInline;
import static java.lang.String.UTF16;
import static java.lang.String.LATIN1;
@@ -50,7 +48,7 @@
throw new OutOfMemoryError("UTF16 String size is " + len +
", should be less than " + MAX_LENGTH);
}
- return new byte[len << 1];
+ return StringConcatHelper.newArray(len << 1);
}
@HotSpotIntrinsicCandidate
@@ -142,7 +140,7 @@
}
public static char[] toChars(byte[] value) {
- char[] dst = new char[value.length >> 1];
+ char[] dst = StringConcatHelper.newCharArray(value.length >> 1);
getChars(value, 0, dst.length, dst, 0);
return dst;
}
@@ -642,7 +640,7 @@
}
}
if (i < len) {
- byte[] buf = new byte[value.length];
+ byte[] buf = StringConcatHelper.newArray(value.length);
for (int j = 0; j < i; j++) {
putChar(buf, j, getChar(value, j)); // TBD:arraycopy?
}
@@ -813,7 +811,7 @@
}
if (first == len)
return str;
- byte[] result = new byte[value.length];
+ byte[] result = StringConcatHelper.newArray(value.length);
System.arraycopy(value, 0, result, 0, first << 1); // Just copy the first few
// lowerCase characters.
String lang = locale.getLanguage();
@@ -918,7 +916,7 @@
if (first == len) {
return str;
}
- byte[] result = new byte[value.length];
+ byte[] result = StringConcatHelper.newArray(value.length);
System.arraycopy(value, 0, result, 0, first << 1); // Just copy the first few
// upperCase characters.
String lang = locale.getLanguage();
diff --git a/src/java.base/share/classes/jdk/internal/misc/Unsafe.java b/src/java.base/share/classes/jdk/internal/misc/Unsafe.java
--- a/src/java.base/share/classes/jdk/internal/misc/Unsafe.java
+++ b/src/java.base/share/classes/jdk/internal/misc/Unsafe.java
@@ -1397,7 +1397,7 @@
throw new IllegalArgumentException("Component type is not primitive");
}
if (length < 0) {
- throw new IllegalArgumentException("Negative length");
+ throw new NegativeArraySizeException(String.valueOf(length));
}
return allocateUninitializedArray0(componentType, length);
}