On Mon, 18 Jan 2021 09:26:07 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> The `StringCoding.resultCached` mechanism is used to remove the allocation >> of a `StringCoding.Result` object on potentially hot paths used in some >> `String` constructors. Using a `ThreadLocal` has overheads though, and the >> overhead got a bit worse after JDK-8258596 which addresses a leak by adding >> a `SoftReference`. >> >> This patch refactors much of the decode logic back into `String` and gets >> rid of not only the `Result` cache, but the `Result` class itself along with >> the `StringDecoder` class and cache. >> >> Microbenchmark results: >> Baseline >> >> Benchmark (charsetName) Mode Cnt >> Score Error Units >> decodeCharset US-ASCII avgt 15 >> 193.043 ± 8.207 ns/op >> decodeCharset:·gc.alloc.rate.norm US-ASCII avgt 15 >> 112.009 ± 0.001 B/op >> decodeCharset ISO-8859-1 avgt 15 >> 164.580 ± 6.514 ns/op >> decodeCharset:·gc.alloc.rate.norm ISO-8859-1 avgt 15 >> 112.009 ± 0.001 B/op >> decodeCharset UTF-8 avgt 15 >> 328.370 ± 18.420 ns/op >> decodeCharset:·gc.alloc.rate.norm UTF-8 avgt 15 >> 224.019 ± 0.002 B/op >> decodeCharset MS932 avgt 15 >> 328.870 ± 20.056 ns/op >> decodeCharset:·gc.alloc.rate.norm MS932 avgt 15 >> 232.020 ± 0.002 B/op >> decodeCharset ISO-8859-6 avgt 15 >> 193.603 ± 12.305 ns/op >> decodeCharset:·gc.alloc.rate.norm ISO-8859-6 avgt 15 >> 112.010 ± 0.001 B/op >> decodeCharsetName US-ASCII avgt 15 >> 209.454 ± 9.040 ns/op >> decodeCharsetName:·gc.alloc.rate.norm US-ASCII avgt 15 >> 112.009 ± 0.001 B/op >> decodeCharsetName ISO-8859-1 avgt 15 >> 188.234 ± 7.533 ns/op >> decodeCharsetName:·gc.alloc.rate.norm ISO-8859-1 avgt 15 >> 112.009 ± 0.001 B/op >> decodeCharsetName UTF-8 avgt 15 >> 399.463 ± 12.437 ns/op >> decodeCharsetName:·gc.alloc.rate.norm UTF-8 avgt 15 >> 224.019 ± 0.003 B/op >> decodeCharsetName MS932 avgt 15 >> 358.839 ± 15.385 ns/op >> decodeCharsetName:·gc.alloc.rate.norm MS932 avgt 15 >> 208.017 ± 0.003 B/op >> decodeCharsetName ISO-8859-6 avgt 15 >> 162.570 ± 7.090 ns/op >> decodeCharsetName:·gc.alloc.rate.norm ISO-8859-6 avgt 15 >> 112.009 ± 0.001 B/op >> decodeDefault N/A avgt 15 >> 316.081 ± 11.101 ns/op >> decodeDefault:·gc.alloc.rate.norm N/A avgt 15 >> 224.019 ± 0.002 B/op >> >> Patched: >> Benchmark (charsetName) Mode Cnt >> Score Error Units >> decodeCharset US-ASCII avgt 15 >> 159.153 ± 6.082 ns/op >> decodeCharset:·gc.alloc.rate.norm US-ASCII avgt 15 >> 112.009 ± 0.001 B/op >> decodeCharset ISO-8859-1 avgt 15 >> 134.433 ± 6.203 ns/op >> decodeCharset:·gc.alloc.rate.norm ISO-8859-1 avgt 15 >> 112.009 ± 0.001 B/op >> decodeCharset UTF-8 avgt 15 >> 297.234 ± 21.654 ns/op >> decodeCharset:·gc.alloc.rate.norm UTF-8 avgt 15 >> 224.019 ± 0.002 B/op >> decodeCharset MS932 avgt 15 >> 311.583 ± 16.445 ns/op >> decodeCharset:·gc.alloc.rate.norm MS932 avgt 15 >> 208.018 ± 0.002 B/op >> decodeCharset ISO-8859-6 avgt 15 >> 156.216 ± 6.522 ns/op >> decodeCharset:·gc.alloc.rate.norm ISO-8859-6 avgt 15 >> 112.010 ± 0.001 B/op >> decodeCharsetName US-ASCII avgt 15 >> 180.752 ± 9.411 ns/op >> decodeCharsetName:·gc.alloc.rate.norm US-ASCII avgt 15 >> 112.010 ± 0.001 B/op >> decodeCharsetName ISO-8859-1 avgt 15 >> 156.170 ± 8.003 ns/op >> decodeCharsetName:·gc.alloc.rate.norm ISO-8859-1 avgt 15 >> 112.010 ± 0.001 B/op >> decodeCharsetName UTF-8 avgt 15 >> 370.337 ± 22.199 ns/op >> decodeCharsetName:·gc.alloc.rate.norm UTF-8 avgt 15 >> 224.019 ± 0.001 B/op >> decodeCharsetName MS932 avgt 15 >> 312.589 ± 15.067 ns/op >> decodeCharsetName:·gc.alloc.rate.norm MS932 avgt 15 >> 208.018 ± 0.002 B/op >> decodeCharsetName ISO-8859-6 avgt 15 >> 173.205 ± 9.647 ns/op >> decodeCharsetName:·gc.alloc.rate.norm ISO-8859-6 avgt 15 >> 112.009 ± 0.001 B/op >> decodeDefault N/A avgt 15 >> 303.459 ± 16.452 ns/op >> decodeDefault:·gc.alloc.rate.norm N/A avgt 15 >> 224.019 ± 0.001 B/op >> >> Most variants improve. There's a small added overhead in `String >> charsetName` variants for some charsets such as `ISO-8859-6` that benefited >> slightly from the `StringDecoder` cache due avoiding a lookup, but most >> variants are not helped by this cache and instead see a significant gain >> from skipping that step. `Charset` variants don't need a lookup and improve >> across the board. >> >> Another drawback is that we need to cram more logic into `String` to bypass >> the `StringCoding.Result` indirection - but getting rid of two commonly used >> `ThreadLocal` caches and most cases getting a bit better raw throughput in >> the process I think more than enough makes up for that. >> >> Testing: tier1-4 > > Claes Redestad has updated the pull request incrementally with one additional > commit since the last revision: > > Harmonize empty string checking in newString methods The large number of package exposed methods in StringCoding is ugly and makes the code harder to maintain. Can the code duplication of UTF8 in the String constructors be reduced? It would be cleaner to move all of the remaining StringCoding methods into String and make them private again. Reading the code now requires quite a bit of cross referencing and the invariants are hard to verify. src/java.base/share/classes/java/lang/StringCoding.java line 424: > 422: > 423: static int decodeUTF8_UTF16(byte[] bytes, int offset, int sl, byte[] > dst, int dp, boolean doReplace) { > 424: while (offset < sl) { The renaming of sp -> offset seems unmotivated and breaks the symmetry with dp. ------------- PR: https://git.openjdk.java.net/jdk/pull/2102