On Wed, 30 Aug 2023 14:08:31 GMT, Claes Redestad <[email protected]> wrote:

>> The `URLEncodeDecode` microbenchmark accidentally generates strings with a 
>> lot of `'\u0000'` chars, heavily skewing towards strings that need to be 
>> encoded in a rather unrealistic way. To be more realistic the benchmark 
>> should test a mix of inputs.
>> 
>> This patch fixes these inadvertent cases, and sets up the benchmark for a 
>> healthier mix by default - adding controls to allow testing some mixed 
>> scenarios.
>> 
>> #15354 explores a few optimizations to `URLEncoder`, but due to the nature of 
>> this microbenchmark a trivial fast-path scan for chars that need no encoding 
>> shows underwhelming results. With the modifications to this benchmark, a 
>> simple fast-path in `URLEncoder.encode` shows a decent win when some or all 
>> the inputs remain unchanged:
>> 
>> 
>> Name                           (encodeChars) (maxLength) (unchanged) Cnt   Base   Error   Test   Error  Unit  Diff%
>> URLEncodeDecode.testDecodeUTF8             6        1024           0  15  3,307 ± 0,507  3,010 ± 0,048 ms/op   9,0% (p = 0,030 )
>> URLEncodeDecode.testDecodeUTF8             6        1024          75  15  2,296 ± 0,003  2,313 ± 0,017 ms/op  -0,7% (p = 0,001*)
>> URLEncodeDecode.testDecodeUTF8             6        1024         100  15  0,812 ± 0,010  0,819 ± 0,017 ms/op  -0,8% (p = 0,201 )
>> URLEncodeDecode.testDecodeUTF8            35        1024           0  15  6,909 ± 0,065  7,192 ± 0,415 ms/op  -4,1% (p = 0,014 )
>> URLEncodeDecode.testDecodeUTF8            35        1024          75  15  3,346 ± 0,206  3,320 ± 0,270 ms/op   0,8% (p = 0,753 )
>> URLEncodeDecode.testDecodeUTF8            35        1024         100  15  0,794 ± 0,034  0,818 ± 0,015 ms/op  -3,0% (p = 0,016 )
>> URLEncodeDecode.testEncodeUTF8             6        1024           0  15  2,434 ± 0,019  2,579 ± 0,120 ms/op  -6,0% (p = 0,000*)
>> URLEncodeDecode.testEncodeUTF8             6        1024          75  15  1,764 ± 0,014  0,937 ± 0,012 ms/op  46,9% (p = 0,000*)
>> URLEncodeDecode.testEncodeUTF8             6        1024         100  15  1,227 ± 0,008  0,401 ± 0,001 ms/op  67,4% (p = 0,000*)
>> URLEncodeDecode.testEncodeUTF8            35        1024           0  15  6,177 ± 0,062  6,057 ± 0,199 ms/op   1,9% (p = 0,029 )
>> URLEncodeDecode.testEncodeUTF8            35        1024          75  15  2,716 ± 0,023  1,876 ± 0,012 ms/op  30,9% (p = 0,000*)
>> URLEncodeDecode.testEncodeUTF8            35        1024         100  15  1,220 ± 0,003  0,401 ± 0,001 ms/op  67,2% (p = 0,000*)
>> 
>> 
>> A potential future improvement would be to extend test data...
>
> Claes Redestad has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Minor cleanup (unused import, unnecessary casts)

Marked as reviewed by dfuchs (Reviewer).

Idle musing: I wonder if it would be useful to print on System.out how many 
strings actually have no encoded chars?
Just to get a rough feeling on whether the percentage values were actually 
respected.
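
To make the idea concrete, here is a rough sketch of such a check. It is 
not taken from the benchmark: the class `UnchangedCounter` and the method 
`countUnchanged` are hypothetical names, and the sample strings are made up. 
It assumes that a string counts as "unchanged" when `URLEncoder.encode` 
returns a value equal to its input.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of the URLEncodeDecode benchmark.
class UnchangedCounter {

    // Counts the inputs that come back from URLEncoder.encode equal to
    // themselves, i.e. strings containing no chars that need encoding.
    static long countUnchanged(List<String> inputs) {
        return inputs.stream()
                .filter(s -> URLEncoder.encode(s, StandardCharsets.UTF_8).equals(s))
                .count();
    }

    public static void main(String[] args) {
        // Made-up sample data; a benchmark would pass its generated strings.
        List<String> sample = List.of("abc", "a b", "x-y_z.0*", "100%");
        long unchanged = countUnchanged(sample);
        System.out.printf("unchanged: %d of %d (%.1f%%)%n",
                unchanged, sample.size(), 100.0 * unchanged / sample.size());
    }
}
```

Something along these lines could run once over the generated test data 
during setup, so the extra encoding pass stays out of the measured code.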

-------------

PR Review: https://git.openjdk.org/jdk/pull/15448#pullrequestreview-1602811091
PR Comment: https://git.openjdk.org/jdk/pull/15448#issuecomment-1699262691
