On Mon, 28 Aug 2023 13:33:46 GMT, Claes Redestad <[email protected]> wrote:
> The `URLEncodeDecode` microbenchmark accidentally generates strings with a
> lot of `'\u0000'` chars, heavily skewing towards strings that need to be
> encoded in a rather unrealistic way. To be more realistic the benchmark
> should test a mix of inputs.
>
> This patch fixes these inadvertent cases, and sets up the benchmark for a
> healthier mix by default, adding controls to allow testing some mixed
> scenarios.
>
> #15354 explores a few optimizations to `URLEncoder`, but due to the nature of
> this microbenchmark a trivial fast-path scan for chars that need no encoding
> shows underwhelming results. With the modifications to this benchmark, a
> simple fast-path in `URLEncoder.encode` shows a decent win when some or all
> of the inputs remain unchanged:
>
>
> Name                            (encodeChars)  (maxLength)  (unchanged)  Cnt   Base   Error   Test   Error   Unit   Diff%
> URLEncodeDecode.testDecodeUTF8              6         1024            0   15  3,307 ± 0,507  3,010 ± 0,048  ms/op    9,0% (p = 0,030 )
> URLEncodeDecode.testDecodeUTF8              6         1024           75   15  2,296 ± 0,003  2,313 ± 0,017  ms/op   -0,7% (p = 0,001*)
> URLEncodeDecode.testDecodeUTF8              6         1024          100   15  0,812 ± 0,010  0,819 ± 0,017  ms/op   -0,8% (p = 0,201 )
> URLEncodeDecode.testDecodeUTF8             35         1024            0   15  6,909 ± 0,065  7,192 ± 0,415  ms/op   -4,1% (p = 0,014 )
> URLEncodeDecode.testDecodeUTF8             35         1024           75   15  3,346 ± 0,206  3,320 ± 0,270  ms/op    0,8% (p = 0,753 )
> URLEncodeDecode.testDecodeUTF8             35         1024          100   15  0,794 ± 0,034  0,818 ± 0,015  ms/op   -3,0% (p = 0,016 )
> URLEncodeDecode.testEncodeUTF8              6         1024            0   15  2,434 ± 0,019  2,579 ± 0,120  ms/op   -6,0% (p = 0,000*)
> URLEncodeDecode.testEncodeUTF8              6         1024           75   15  1,764 ± 0,014  0,937 ± 0,012  ms/op   46,9% (p = 0,000*)
> URLEncodeDecode.testEncodeUTF8              6         1024          100   15  1,227 ± 0,008  0,401 ± 0,001  ms/op   67,4% (p = 0,000*)
> URLEncodeDecode.testEncodeUTF8             35         1024            0   15  6,177 ± 0,062  6,057 ± 0,199  ms/op    1,9% (p = 0,029 )
> URLEncodeDecode.testEncodeUTF8             35         1024           75   15  2,716 ± 0,023  1,876 ± 0,012  ms/op   30,9% (p = 0,000*)
> URLEncodeDecode.testEncodeUTF8             35         1024          100   15  1,220 ± 0,003  0,401 ± 0,001  ms/op   67,2% (p = 0,000*)
>
>
> A potential future improvement would be to extend test data with varying
> amounts of surrogate pairs, e.g....

Thanks for improving this microbenchmark, Claes. Changes look good to me.
Using 1024 strings should ensure that at least some of them have some
characters that need to be encoded/decoded.

-------------

Marked as reviewed by dfuchs (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/15448#pullrequestreview-1602752757
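
For readers following the thread, the "trivial fast-path scan" mentioned in the quoted description can be sketched roughly as below. This is only an illustration of the idea, not the code from #15354: the class `FastPathSketch` and its methods are hypothetical names, and the scan is shown as a wrapper around the public `URLEncoder.encode(String, Charset)` rather than inside `URLEncoder` itself. The set of unchanged characters follows the `URLEncoder` documentation (letters, digits, and `.`, `-`, `*`, `_`).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the JDK or of the patch.
public class FastPathSketch {

    // True for characters that URLEncoder leaves as-is.
    // Space is deliberately excluded because it is rewritten to '+'.
    static boolean needsNoEncoding(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '.' || c == '-' || c == '*' || c == '_';
    }

    // Scan first; fall back to the full encoder only when some char needs work.
    static String encode(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!needsNoEncoding(s.charAt(i))) {
                return URLEncoder.encode(s, StandardCharsets.UTF_8);
            }
        }
        return s; // nothing to encode: return the input without allocating
    }

    public static void main(String[] args) {
        System.out.println(encode("abc-123_X.*")); // input returned unchanged
        System.out.println(encode("a b/c"));       // falls back to URLEncoder
    }
}
```

With inputs where most strings contain no such characters (the `unchanged` = 75 and 100 rows above), the scan returns early without allocating, which is consistent with where the encode results improve.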
