On Fri, 1 Aug 2025 12:39:05 GMT, Brett Okken <[email protected]> wrote:
>> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can count the leading positive bytes and in the case where there is a negative, we can copy all the positive values to the target byte[] prior to processing the remaining data 1 byte at a time.
>>
>> https://mail.openjdk.org/pipermail/core-libs-dev/2025-July/149417.html
>
> Benchmark on win64
>
> Baseline:
>
> Benchmark                           (charsetName)  Mode  Cnt      Score     Error  Units
> StringEncode.encodeAllMixed                 UTF-8  avgt   10  20067.519 ± 528.152  ns/op
> StringEncode.encodeAsciiLong                UTF-8  avgt   10  12115.389 ± 307.491  ns/op
> StringEncode.encodeAsciiShort               UTF-8  avgt   10     70.098 ±   1.696  ns/op
> StringEncode.encodeLatin1LongEnd            UTF-8  avgt   10   1974.391 ± 162.405  ns/op
> StringEncode.encodeLatin1LongOnly           UTF-8  avgt   10    270.097 ±  13.840  ns/op
> StringEncode.encodeLatin1LongStart          UTF-8  avgt   10   1876.366 ±  51.971  ns/op
> StringEncode.encodeLatin1Mixed              UTF-8  avgt   10   4973.070 ± 130.426  ns/op
> StringEncode.encodeLatin1Short              UTF-8  avgt   10     96.227 ±   2.816  ns/op
> StringEncode.encodeShortMixed               UTF-8  avgt   10    360.586 ±   8.691  ns/op
> StringEncode.encodeUTF16LongEnd             UTF-8  avgt   10   1534.748 ±  34.584  ns/op
> StringEncode.encodeUTF16LongOnly            UTF-8  avgt   10    528.919 ±  15.143  ns/op
> StringEncode.encodeUTF16LongStart           UTF-8  avgt   10   2275.117 ±  50.152  ns/op
> StringEncode.encodeUTF16Mixed               UTF-8  avgt   10   4398.943 ± 116.607  ns/op
> StringEncode.encodeUTF16Short               UTF-8  avgt   10    152.219 ±   8.677  ns/op
>
> Patch:
>
> Benchmark                           (charsetName)  Mode  Cnt      Score     Error  Units
> StringEncode.encodeAllMixed                 UTF-8  avgt   10  18876.056 ± 330.644  ns/op
> StringEncode.encodeAsciiLong                UTF-8  avgt   10  12040.590 ± 165.905  ns/op
> StringEncode.encodeAsciiShort               UTF-8  avgt   10     69.895 ±   0.318  ns/op
> StringEncode.encodeLatin1LongEnd            UTF-8  avgt   10    574.455 ±  14.769  ns/op
> StringEncode.encodeLatin1LongOnly           UTF-8  avgt   10    284.553 ±   1.886  ns/op
> StringEncode.encodeLatin1LongStart          UTF-8  avgt   10   2230.789 ±  11.043  ns/op
> StringEncode.encodeLatin1Mixed              UTF-8  avgt   10   3278.998 ±  96.779  ns/op
> StringEncode.encodeLatin1Short              UTF-8  avgt   10     99.332 ±   1.977  ns/op
> StringEncode.encodeShortMixed               UTF-8  avgt   10    378.183 ±  17.504  ns/op
> StringEncode.encodeUTF16LongEnd             UTF-8  avgt   10   1531.960 ±  19.300  ns/op
> StringEncode.encodeUTF16LongOnly            UTF-8  avgt   10    563.810 ±   4.811  ns/op
> StringEncode.encodeUTF16LongS...

@bokken FYI to make JMH comparison easier, you can let JMH generate JSON reports, upload them to github gists, and use https://jmh.morethan.io/ to compare the two results from two gists.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26597#issuecomment-3145088238
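
For readers following the thread, the idea quoted above (count the leading non-negative bytes, bulk-copy them, then handle the remainder one byte at a time) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the PR: the class name `Latin1ToUtf8Sketch` and the helpers `countLeadingPositives` and `encode` are hypothetical, and the JDK's real implementation lives in internal, intrinsified code that is not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; all names here are hypothetical, not JDK internals.
class Latin1ToUtf8Sketch {

    // Counts how many leading bytes have the high bit clear (ASCII range).
    static int countLeadingPositives(byte[] src) {
        int i = 0;
        while (i < src.length && src[i] >= 0) {
            i++;
        }
        return i;
    }

    // Encodes Latin-1 bytes as UTF-8.
    static byte[] encode(byte[] latin1) {
        int ascii = countLeadingPositives(latin1);
        if (ascii == latin1.length) {
            // Pure ASCII: the UTF-8 form is byte-for-byte identical.
            return latin1.clone();
        }
        // Worst case: every remaining byte expands to two UTF-8 bytes.
        byte[] dst = new byte[ascii + (latin1.length - ascii) * 2];
        // Bulk-copy the ASCII prefix in one step.
        System.arraycopy(latin1, 0, dst, 0, ascii);
        int dp = ascii;
        // Process the rest one byte at a time.
        for (int sp = ascii; sp < latin1.length; sp++) {
            byte b = latin1[sp];
            if (b >= 0) {
                dst[dp++] = b;
            } else {
                // U+0080..U+00FF become a two-byte sequence: 110000xx 10xxxxxx.
                dst[dp++] = (byte) (0xC0 | ((b & 0xFF) >> 6));
                dst[dp++] = (byte) (0x80 | (b & 0x3F));
            }
        }
        return Arrays.copyOf(dst, dp);
    }

    public static void main(String[] args) {
        String s = "plain ascii prefix, then caf\u00e9";
        byte[] got = encode(s.getBytes(StandardCharsets.ISO_8859_1));
        // Compare against the JDK's own UTF-8 encoder.
        System.out.println(Arrays.equals(got, s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In this sketch the bulk copy only pays off when the non-ASCII byte appears late in the input, which would be consistent with `encodeLatin1LongEnd` showing the largest change in the numbers quoted above.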
