On Wed, 9 Mar 2022 08:33:36 GMT, Xin Liu <x...@openjdk.org> wrote:

>> If an AbstractStringBuilder only grows, an inflated value that has been
>> encoded in UTF16 can't be compressed.
>> toString() can skip compression in this case. This saves an array
>> allocation in StringUTF16::compress().
>>
>> java.io.BufferedReader::readLine() is a case where the StringBuilder only
>> grows.
>>
>> In the microbenchmark, we expect allocation per op to drop by 20%. The
>> initial capacity of the StringBuilder is S bytes. When it encounters the
>> first character that can't be encoded in LATIN1, it inflates and allocates
>> a new array of 2*S bytes. `toString()` then tries to compress that value,
>> so it needs to allocate S bytes. The last step allocates 2*S bytes because
>> it has to copy the string, so 5 * S bytes are allocated in total. By
>> skipping the failed compression, only 4 * S bytes are allocated; that's
>> 20%. In real execution, we observe a 16% allocation reduction, tracked by
>> the JMH GC profiler metric `gc.alloc.rate.norm`. I think it's because
>> HotSpot can't track all allocations.
>>
>> Not only does allocation drop, the runtime performance (ns/op) also
>> improves by 3.34% to 18.91%.
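The allocation arithmetic in the description can be written out as a small sketch. This is only a back-of-the-envelope model, not code from the patch: the class and method names are hypothetical, and it counts array payload bytes only (object headers and the String/StringBuilder objects themselves are ignored). The `countInitialArray` flag is my own addition, to show how the estimate changes if the initial S-byte LATIN1 array is included in the sum.

```java
// Back-of-the-envelope model of array bytes allocated per operation.
// Assumption: only array payloads are counted; headers are ignored.
final class AllocationEstimate {

    /** Array bytes allocated when toString() attempts (and fails) compression. */
    static long withFailedCompression(long s, boolean countInitialArray) {
        long initial = countInitialArray ? s : 0; // LATIN1 backing array of capacity S
        long inflated = 2 * s;                    // UTF16 array allocated on inflation
        long scratch = s;                         // array allocated by the failed compression
        long copy = 2 * s;                        // copy made for the resulting String
        return initial + inflated + scratch + copy;
    }

    /** Array bytes allocated when the failed compression is skipped. */
    static long withoutCompression(long s, boolean countInitialArray) {
        // Skipping the compression attempt saves exactly the S-byte scratch array.
        return withFailedCompression(s, countInitialArray) - s;
    }

    /** Relative reduction in allocated array bytes, in percent. */
    static double reductionPercent(long s, boolean countInitialArray) {
        long before = withFailedCompression(s, countInitialArray);
        long after = withoutCompression(s, countInitialArray);
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        long s = 128; // hypothetical capacity in bytes
        for (boolean countInitial : new boolean[] {false, true}) {
            System.out.println("countInitialArray=" + countInitial
                    + ": " + withFailedCompression(s, countInitial)
                    + " -> " + withoutCompression(s, countInitial)
                    + " bytes, reduction " + reductionPercent(s, countInitial) + "%");
        }
    }
}
```

Without the initial array this reproduces the 5 * S versus 4 * S figures, i.e. 20%. If the initial S-byte array is counted as well, the totals become 6 * S versus 5 * S, about 16.7%, which would be close to the measured 16%. That second reading is an assumption on my part, not something the description states.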
>>
>> Before:
>>
>> $ make test TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" MICRO="OPTIONS=-prof gc -gc true -o before.log -jvm $HOME/Devel/jdk_baseline/bin/java"
>>
>> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score      Error  Units
>> StringBuilders.toStringWithMixedChars                                            128  avgt   15   649.846 ±   76.291  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   872.855 ±  128.259  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   880.121 ±    0.050  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   707.730 ±  194.421  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   706.602 ±   94.504  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15     0.001 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15     0.001 ±    0.001  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   113.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    85.000             ms
>> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1316.652 ±  112.771  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   800.864 ±   76.869  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1648.288 ±    0.162  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   599.736 ±  174.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1229.669 ±  318.518  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.001 ±    0.001  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.001 ±    0.002  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   133.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    92.000             ms
>> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  5204.303 ±  418.115  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   768.730 ±   72.945  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  6256.844 ±    0.358  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   655.852 ±  121.602  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  5315.265 ±  578.878  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.002 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.014 ±    0.011  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15    96.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                  1024  avgt   15    86.000             ms
>>
>>
>> After:
>>
>> $ make test TEST="micro:org.openjdk.bench.java.lang.StringBuilders.toStringWithMixedChars" MICRO="OPTIONS=-prof gc -gc true -o after.log"
>>
>> Benchmark                                                               (MIXED_SIZE)  Mode  Cnt     Score      Error  Units
>> StringBuilders.toStringWithMixedChars                                            128  avgt   15   627.522 ±   54.804  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             128  avgt   15   751.353 ±   83.478  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        128  avgt   15   736.120 ±    0.062  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    128  avgt   15   589.102 ±  164.083  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               128  avgt   15   575.425 ±  127.021  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                128  avgt   15    ≈ 10⁻³             MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           128  avgt   15    ≈ 10⁻³             B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  128  avgt   15   116.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   128  avgt   15    86.000             ms
>> StringBuilders.toStringWithMixedChars                                            256  avgt   15  1185.884 ±  137.364  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                             256  avgt   15   746.179 ±   92.534  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                        256  avgt   15  1376.244 ±    0.125  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                    256  avgt   15   579.137 ±  208.636  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm               256  avgt   15  1055.150 ±  307.010  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space                256  avgt   15     0.002 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm           256  avgt   15     0.003 ±    0.003  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                  256  avgt   15   126.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                   256  avgt   15    97.000             ms
>> StringBuilders.toStringWithMixedChars                                           1024  avgt   15  4220.415 ±  440.169  ns/op
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate                            1024  avgt   15   791.945 ±   75.231  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.alloc.rate.norm                       1024  avgt   15  5217.435 ±    0.543  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space                   1024  avgt   15   657.270 ±  235.803  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Eden_Space.norm              1024  avgt   15  4232.470 ± 1267.388  B/op
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space               1024  avgt   15     0.005 ±    0.002  MB/sec
>> StringBuilders.toStringWithMixedChars:·gc.churn.G1_Survivor_Space.norm          1024  avgt   15     0.033 ±    0.014  B/op
>> StringBuilders.toStringWithMixedChars:·gc.count                                 1024  avgt   15   202.000             counts
>> StringBuilders.toStringWithMixedChars:·gc.time                                  1024  avgt   15   121.000             ms
>
> Xin Liu has updated the pull request incrementally with one additional commit
> since the last revision:
>
>   Change growOnly to maybeLatin.
>
>   This patch also copys over the attribute from the other
>   AbstractStringBuilder.
>   Add a unit test to cover methods which cause maybeLatin1 becomes true.

I used **jol** to inspect the layout of a StringBuilder object. The patch doesn't increase the object size, because the extra boolean fits in the alignment gap between coder and value.

    $ java -jar ~/Devel/jol-cli-latest.jar internals java.lang.StringBuilder
    Instantiated the sample instance via default constructor.

    java.lang.StringBuilder object internals:
    OFF  SZ      TYPE DESCRIPTION                               VALUE
      0   8           (object header: mark)                     0x0000000000000001 (non-biasable; age: 0)
      8   4           (object header: class)                    0x000540e0
     12   4       int AbstractStringBuilder.count               0
     16   1      byte AbstractStringBuilder.coder               0
     17   1   boolean AbstractStringBuilder.maybeLatin1         false
     18   2           (alignment/padding gap)
     20   4    byte[] AbstractStringBuilder.value               [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    Instance size: 24 bytes
    Space losses: 2 bytes internal + 0 bytes external = 2 bytes total

Running

    $ java -jar ~/Devel/jol-cli-latest.jar estimates java.lang.StringBuilder

iterates over different HotSpot setups. The sizes are still the same as the baseline.

I don't know a lot about Java object layout. It looks like Java guarantees that a reference is 4-byte aligned.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7671