Re: RFR: 8258917: NativeMemoryTracking is handled by launcher inconsistenly

2021-01-15 Thread Zhengyu Gu
On Fri, 15 Jan 2021 23:50:16 GMT, Alex Menkov  wrote:

> The fix adds NMT handling for non-java launchers

Looks good

-

Marked as reviewed by zgu (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2106


Re: RFR: 8259842: Remove Result cache from StringCoding [v4]

2021-01-15 Thread Claes Redestad
On Sat, 16 Jan 2021 00:25:24 GMT, Peter Levart  wrote:

> Do you think this code belongs more to String than to StringCoding?

I consider StringCoding an implementation detail of String, so I'm not sure 
there's (much) value in maintaining the separation of concerns if it causes 
a performance loss. While encapsulating and separating concerns is a fine 
goal, I think the main purpose served by StringCoding is to resolve some 
bootstrap issues: String is one of the first classes to be loaded, and eagerly 
pulling in dependencies like ThreadLocal and Charsets before bootstrapping 
leads to all manner of hard-to-decipher issues (yes - I've tried :-)).
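
For readers unfamiliar with the pattern being removed, here is a minimal sketch of a thread-local, softly referenced scratch object. The names `ResultCacheSketch`, `Result`, `cachedResult` and `decodeLatin1` are hypothetical and are not the JDK's actual `StringCoding` code; the sketch only illustrates that every use pays for a `ThreadLocal` lookup plus a `SoftReference` dereference.

```java
import java.lang.ref.SoftReference;
import java.nio.charset.StandardCharsets;

public class ResultCacheSketch {
    // Hypothetical holder for a decode result (value bytes plus a coder flag).
    static final class Result {
        byte[] value;
        byte coder;
    }

    // One softly referenced Result per thread; the GC may clear it under memory pressure.
    private static final ThreadLocal<SoftReference<Result>> CACHE = new ThreadLocal<>();

    // Returns the calling thread's cached Result, creating it on first use
    // or after the soft reference has been cleared.
    static Result cachedResult() {
        SoftReference<Result> ref = CACHE.get();
        Result r = (ref == null) ? null : ref.get();
        if (r == null) {
            r = new Result();
            CACHE.set(new SoftReference<>(r));
        }
        return r;
    }

    // Illustrative "decode": fills the cached Result instead of allocating a new one.
    static Result decodeLatin1(byte[] bytes) {
        Result r = cachedResult();
        r.value = bytes.clone();
        r.coder = 0;
        return r;
    }

    public static void main(String[] args) {
        Result a = decodeLatin1("abc".getBytes(StandardCharsets.ISO_8859_1));
        Result b = decodeLatin1("de".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(a == b);          // prints true: the same thread reuses one Result
        System.out.println(b.value.length);  // prints 2
    }
}
```

The cache saves one small allocation per call, which is the trade-off the thread is weighing against the lookup overhead.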

-

PR: https://git.openjdk.java.net/jdk/pull/2102


Re: RFR: 8259842: Remove Result cache from StringCoding [v4]

2021-01-15 Thread Claes Redestad
> The `StringCoding.resultCached` mechanism is used to remove the allocation of 
> a `StringCoding.Result` object on potentially hot paths used in some `String` 
> constructors. Using a `ThreadLocal` has overheads though, and the overhead 
> got a bit worse after JDK-8258596 which addresses a leak by adding a 
> `SoftReference`.
> 
> This patch refactors much of the decode logic back into `String` and gets rid 
> of not only the `Result` cache, but the `Result` class itself along with the 
> `StringDecoder` class and cache.
> 
> Microbenchmark results:
> Baseline:
> 
> Benchmark                              (charsetName)  Mode  Cnt    Score    Error  Units
> decodeCharset                          US-ASCII       avgt   15  193.043 ±  8.207  ns/op
> decodeCharset:·gc.alloc.rate.norm      US-ASCII       avgt   15  112.009 ±  0.001  B/op
> decodeCharset                          ISO-8859-1     avgt   15  164.580 ±  6.514  ns/op
> decodeCharset:·gc.alloc.rate.norm      ISO-8859-1     avgt   15  112.009 ±  0.001  B/op
> decodeCharset                          UTF-8          avgt   15  328.370 ± 18.420  ns/op
> decodeCharset:·gc.alloc.rate.norm      UTF-8          avgt   15  224.019 ±  0.002  B/op
> decodeCharset                          MS932          avgt   15  328.870 ± 20.056  ns/op
> decodeCharset:·gc.alloc.rate.norm      MS932          avgt   15  232.020 ±  0.002  B/op
> decodeCharset                          ISO-8859-6     avgt   15  193.603 ± 12.305  ns/op
> decodeCharset:·gc.alloc.rate.norm      ISO-8859-6     avgt   15  112.010 ±  0.001  B/op
> decodeCharsetName                      US-ASCII       avgt   15  209.454 ±  9.040  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  US-ASCII       avgt   15  112.009 ±  0.001  B/op
> decodeCharsetName                      ISO-8859-1     avgt   15  188.234 ±  7.533  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  ISO-8859-1     avgt   15  112.009 ±  0.001  B/op
> decodeCharsetName                      UTF-8          avgt   15  399.463 ± 12.437  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  UTF-8          avgt   15  224.019 ±  0.003  B/op
> decodeCharsetName                      MS932          avgt   15  358.839 ± 15.385  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  MS932          avgt   15  208.017 ±  0.003  B/op
> decodeCharsetName                      ISO-8859-6     avgt   15  162.570 ±  7.090  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  ISO-8859-6     avgt   15  112.009 ±  0.001  B/op
> decodeDefault                          N/A            avgt   15  316.081 ± 11.101  ns/op
> decodeDefault:·gc.alloc.rate.norm      N/A            avgt   15  224.019 ±  0.002  B/op
> 
> Patched:
> 
> Benchmark                              (charsetName)  Mode  Cnt    Score    Error  Units
> decodeCharset                          US-ASCII       avgt   15  159.153 ±  6.082  ns/op
> decodeCharset:·gc.alloc.rate.norm      US-ASCII       avgt   15  112.009 ±  0.001  B/op
> decodeCharset                          ISO-8859-1     avgt   15  134.433 ±  6.203  ns/op
> decodeCharset:·gc.alloc.rate.norm      ISO-8859-1     avgt   15  112.009 ±  0.001  B/op
> decodeCharset                          UTF-8          avgt   15  297.234 ± 21.654  ns/op
> decodeCharset:·gc.alloc.rate.norm      UTF-8          avgt   15  224.019 ±  0.002  B/op
> decodeCharset                          MS932          avgt   15  311.583 ± 16.445  ns/op
> decodeCharset:·gc.alloc.rate.norm      MS932          avgt   15  208.018 ±  0.002  B/op
> decodeCharset                          ISO-8859-6     avgt   15  156.216 ±  6.522  ns/op
> decodeCharset:·gc.alloc.rate.norm      ISO-8859-6     avgt   15  112.010 ±  0.001  B/op
> decodeCharsetName                      US-ASCII       avgt   15  180.752 ±  9.411  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  US-ASCII       avgt   15  112.010 ±  0.001  B/op
> decodeCharsetName                      ISO-8859-1     avgt   15  156.170 ±  8.003  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  ISO-8859-1     avgt   15  112.010 ±  0.001  B/op
> decodeCharsetName                      UTF-8          avgt   15  370.337 ± 22.199  ns/op
> decodeCharsetName:·gc.alloc.rate.norm  UTF-8          avgt   15  224.019 ±  0.001  B/op
> decodeCharsetName                      MS932          avgt   15  312.589 ±

Re: RFR: JDK-8259238: Clean up Log.java and remove usage of non-final static variables.

2021-01-15 Thread Alexander Zuev
On Thu, 7 Jan 2021 16:40:26 GMT, Andy Herrick  wrote:

> JDK-8259238: Clean up Log.java and remove usage of non-final static variables.

Marked as reviewed by kizune (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/1977


Re: RFR: JDK-8258755: jpackage: Invalid 32-bit exe when building app-image

2021-01-15 Thread Alexander Zuev
On Mon, 11 Jan 2021 17:42:21 GMT, Andy Herrick  wrote:

> JDK-8258755: jpackage: Invalid 32-bit exe when building app-image

Marked as reviewed by kizune (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/2030


Re: RFR: JDK-8259062: Remove MacAppStoreBundler

2021-01-15 Thread Alexander Zuev
On Wed, 6 Jan 2021 15:52:07 GMT, Andy Herrick  wrote:

> JDK-8259062: Remove MacAppStoreBundler

Marked as reviewed by kizune (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/1962


Re: [jdk16] RFR: JDK-8259732: JDK 16 L10n resource file update - msg drop 10 [v2]

2021-01-15 Thread Naoto Sato
On Fri, 15 Jan 2021 01:59:07 GMT, Leo Jiang  wrote:

>> src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/resources/standard.properties
>>  line 518:
>> 
>>> 516: doclet.footer_specified=\
>>> 517: The -footer option is no longer supported and will be ignored.\n\
>>> 518: It may be removed in a future release.
>> 
>> I believe this is to fix the missing newline at the end of the file 
>> (unrelated to l10n changes). Do we need to change the copyright year for this?
>
> Actually I was correcting L217, changing {) -> {}. But 2 days ago, 
> Jonathan Gibbons fixed it in another commit, 6bb6093. I found his fix after 
> running the merge. Looks like both of us forgot to update the copyright year. 
> Any suggestions? 
> doclet.help.systemProperties.body=\
> The {0} page lists references to system properties.

I believe your PR still contains the fix to add the newline at the end of the 
file (at L518). So I think you can simply change the copyright year in the 
`standard.properties` file.

-

PR: https://git.openjdk.java.net/jdk16/pull/123


Re: RFR: 8259842: Remove Result cache from StringCoding [v3]

2021-01-15 Thread Peter Levart
On Fri, 15 Jan 2021 22:03:31 GMT, Claes Redestad  wrote:

>> Hi Claes,
>> This is quite an undertaking in re-factoring!
>> I think the decoding logic that you produced can still be kept in the 
>> StringCoding class though. The private constructor that you created for 
>> UTF-8 decoding is unnecessary. The logic can be kept in a static method and 
>> then the String instance constructed in it from the value and coder. The 
>> only public constructor that remains is then the one taking a Charset 
>> parameter. I took your code and moved it back to StringCoding, while for the 
>> mentioned constructor I tried the following trick which could work (I 
>> haven't tested it yet with JMH): if you pass an instance of a newly 
>> constructed object down to a method that is inlined into the caller and that 
>> method does not pass that object to any other methods, the JIT will eliminate 
>> the heap allocation, so I suspect that you can pass results from the called 
>> method back in that object and still avoid allocation...
>> 
>> https://github.com/plevart/jdk/commit/691600e3789a4c2b52fae921be87d0d7affa6f0f
>> 
>> WDYT?
>
>> WDYT?
> 
> I get that the approach I took got a bit messy, but I've just spent some time 
> cleaning it up. Please have a look at the latest, which outlines much of the 
> logic and consolidates the replace/throw logic in the UTF8 decode paths. I've 
> checked it does not regress on the micro, and I think the overall state of 
> the code now isn't much messier than the original code.

Some common logic is now extracted into methods. That's good. But much of the 
decoding logic still remains in String. I still think all of the static methods 
can be moved to StringCoding directly as they are now, while the private UTF-8 
decoding constructor can be replaced with a static method and moved to 
StringCoding. The only problem might be the public String constructor taking 
a Charset parameter. But even here, passing a newly constructed mutable object 
to the method can be used to return multiple values from the method while 
relying on the JIT to eliminate the object allocation.
Do you think this code belongs more to String than to StringCoding?
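
The "return multiple values through a freshly allocated holder" idea can be sketched roughly as follows. `EscapeSketch`, `Holder`, `decode` and `describe` are hypothetical names and not code from the patch or the linked commit. Whether the JIT actually removes the allocation depends on inlining and escape analysis, so it would still need to be confirmed with JMH.

```java
import java.nio.charset.StandardCharsets;

public class EscapeSketch {
    // Hypothetical mutable holder used only to carry two results out of decode().
    static final class Holder {
        byte[] value;
        byte coder;
    }

    // Small static method: if it is inlined into the caller and the holder is not
    // passed anywhere else, escape analysis may replace the holder with scalars.
    static void decode(byte[] src, Holder out) {
        boolean ascii = true;
        for (byte b : src) {
            if (b < 0) {          // high bit set, so not 7-bit ASCII
                ascii = false;
                break;
            }
        }
        out.value = src.clone();
        out.coder = (byte) (ascii ? 0 : 1);
    }

    // The holder is created here, filled by decode(), and never stored or passed on.
    static int describe(byte[] src) {
        Holder h = new Holder();
        decode(src, h);
        return h.value.length * 10 + h.coder;
    }

    public static void main(String[] args) {
        System.out.println(describe("abc".getBytes(StandardCharsets.US_ASCII))); // prints 30
    }
}
```

The holder never leaves `describe`, which is the precondition for the allocation to be eliminated. A public constructor that must assign its own fields from the holder fits the same shape.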

-

PR: https://git.openjdk.java.net/jdk/pull/2102


Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests [v2]

2021-01-15 Thread Valerie Peng
On Fri, 15 Jan 2021 23:36:35 GMT, Claes Redestad  wrote:

>> - The MD5 intrinsics added by 
>> [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) show that 
>> the `int[] x` isn't actually needed. This also applies to the SHA intrinsics 
>> from which the MD5 intrinsic takes inspiration.
>> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to 
>> make it acceptable to use inline and replace the array in MD5 wholesale. 
>> This improves performance both in the presence and the absence of the 
>> intrinsic optimization.
>> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element 
>> arrays), but allocating the array lazily gets most of the speed-up in the 
>> presence of an intrinsic while being neutral in its absence.
>> 

RFR: 8258917: NativeMemoryTracking is handled by launcher inconsistenly

2021-01-15 Thread Alex Menkov
The fix adds NMT handling for non-java launchers

-

Commit messages:
 - Handle NMT for non-java launchers

Changes: https://git.openjdk.java.net/jdk/pull/2106/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2106&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8258917
  Stats: 16 lines in 2 files changed: 8 ins; 1 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2106.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2106/head:pull/2106

PR: https://git.openjdk.java.net/jdk/pull/2106


Re: RFR: 8258917: NativeMemoryTracking is handled by launcher inconsistenly

2021-01-15 Thread Alex Menkov
On Fri, 15 Jan 2021 23:50:16 GMT, Alex Menkov  wrote:

> The fix adds NMT handling for non-java launchers

Added serviceability as serviceability tools use launcher functionality

-

PR: https://git.openjdk.java.net/jdk/pull/2106


Re: RFR: 8259842: Remove Result cache from StringCoding [v3]

2021-01-15 Thread Claes Redestad
> The `StringCoding.resultCached` mechanism is used to remove the allocation of 
> a `StringCoding.Result` object on potentially hot paths used in some `String` 
> constructors. Using a `ThreadLocal` has overheads though, and the overhead 
> got a bit worse after JDK-8258596 which addresses a leak by adding a 
> `SoftReference`.
> 
> This patch refactors much of the decode logic back into `String` and gets rid 
> of not only the `Result` cache, but the `Result` class itself along with the 
> `StringDecoder` class and cache.
> 

Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests [v2]

2021-01-15 Thread Claes Redestad
> - The MD5 intrinsics added by 
> [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) show that 
> the `int[] x` isn't actually needed. This also applies to the SHA intrinsics 
> from which the MD5 intrinsic takes inspiration.
> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to 
> make it acceptable to use inline and replace the array in MD5 wholesale. This 
> improves performance both in the presence and the absence of the intrinsic 
> optimization.
> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element 
> arrays), but allocating the array lazily gets most of the speed-up in the 
> presence of an intrinsic while being neutral in its absence.
> 
> Baseline:
> Benchmark                    (digesterName)  (length)  Cnt     Score     Error  Units
> MessageDigests.digest        MD5                   16   15  2714.307 ±  21.133  ops/ms
> MessageDigests.digest        MD5                 1024   15   318.087 ±   0.637  ops/ms
> MessageDigests.digest        SHA-1                 16   15  1387.266 ±  40.932  ops/ms
> MessageDigests.digest        SHA-1               1024   15   109.273 ±   0.149  ops/ms
> MessageDigests.digest        SHA-256               16   15   995.566 ±  21.186  ops/ms
> MessageDigests.digest        SHA-256             1024   15    89.104 ±   0.079  ops/ms
> MessageDigests.digest        SHA-512               16   15   803.030 ±  15.722  ops/ms
> MessageDigests.digest        SHA-512             1024   15   115.611 ±   0.234  ops/ms
> MessageDigests.getAndDigest  MD5                   16   15  2190.367 ±  97.037  ops/ms
> MessageDigests.getAndDigest  MD5                 1024   15   302.903 ±   1.809  ops/ms
> MessageDigests.getAndDigest  SHA-1                 16   15  1262.656 ±  43.751  ops/ms
> MessageDigests.getAndDigest  SHA-1               1024   15   104.889 ±   3.554  ops/ms
> MessageDigests.getAndDigest  SHA-256               16   15   914.541 ±  55.621  ops/ms
> MessageDigests.getAndDigest  SHA-256             1024   15    85.708 ±   1.394  ops/ms
> MessageDigests.getAndDigest  SHA-512               16   15   737.719 ±  53.671  ops/ms
> MessageDigests.getAndDigest  SHA-512             1024   15   112.307 ±   1.950  ops/ms
> 
> GC:
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  MD5                   16   15   312.011 ±   0.005  B/op
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-1                 16   15   584.020 ±   0.006  B/op
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-256               16   15   544.019 ±   0.016  B/op
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-512               16   15  1056.037 ±   0.003  B/op
> 
> Target:
> Benchmark                    (digesterName)  (length)  Cnt     Score     Error  Units
> MessageDigests.digest        MD5                   16   15  3134.462 ±  43.685  ops/ms
> MessageDigests.digest        MD5                 1024   15   323.667 ±   0.633  ops/ms
> MessageDigests.digest        SHA-1                 16   15  1418.742 ±  38.223  ops/ms
> MessageDigests.digest        SHA-1               1024   15   110.178 ±   0.788  ops/ms
> MessageDigests.digest        SHA-256               16   15  1037.949 ±  21.214  ops/ms
> MessageDigests.digest        SHA-256             1024   15    89.671 ±   0.228  ops/ms
> MessageDigests.digest        SHA-512               16   15   812.028 ±  39.489  ops/ms
> MessageDigests.digest        SHA-512             1024   15   116.738 ±   0.249  ops/ms
> MessageDigests.getAndDigest  MD5                   16   15  2314.379 ± 229.294  ops/ms
> MessageDigests.getAndDigest  MD5                 1024   15   307.835 ±   5.730  ops/ms
> MessageDigests.getAndDigest  SHA-1                 16   15  1326.887 ±  63.263  ops/ms
> MessageDigests.getAndDigest  SHA-1               1024   15   106.611 ±   2.292  ops/ms
> MessageDigests.getAndDigest  SHA-256               16   15   961.589 ±  82.052  ops/ms
> MessageDigests.getAndDigest  SHA-256             1024   15    88.646 ±   0.194  ops/ms
> MessageDigests.getAndDigest  SHA-512               16   15   775.417 ±  56.775  ops/ms
> MessageDigests.getAndDigest  SHA-512             1024   15   112.904 ±   2.014  ops/ms
> 
> GC:
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  MD5                   16   15   232.009 ±   0.006  B/op
> MessageDigests.getAndDigest:·gc.alloc.r
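
To make the VarHandle point in the description above concrete, here is a minimal sketch of reading 32-bit words straight out of a `byte[]` with a byte-array view `VarHandle`, which removes the need to first copy the block into an `int[]`. The class and method names (`VarHandleReadSketch`, `readIntLE`) are illustrative assumptions and are not the actual `ByteArrayAccess` code from the PR.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

public class VarHandleReadSketch {
    // View of a byte[] as little-endian ints (MD5 reads its input words little-endian).
    private static final VarHandle LE_INT =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Reads the 32-bit word starting at byte offset 'ofs'.
    // The VarHandle itself performs the bounds check.
    static int readIntLE(byte[] buf, int ofs) {
        return (int) LE_INT.get(buf, ofs);
    }

    public static void main(String[] args) {
        byte[] block = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
        System.out.println(Integer.toHexString(readIntLE(block, 0))); // prints 4030201
        System.out.println(Integer.toHexString(readIntLE(block, 4))); // prints 8070605
    }
}
```

An out-of-range offset such as `readIntLE(block, 6)` throws `IndexOutOfBoundsException`, so a separate explicit range check in the caller becomes redundant.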

Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests [v2]

2021-01-15 Thread Claes Redestad
On Fri, 15 Jan 2021 23:21:00 GMT, Valerie Peng  wrote:

>> Claes Redestad has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains 20 additional 
>> commits since the last revision:
>> 
>>  - Copyrights
>>  - Merge branch 'master' into improve_md5
>>  - Remove unused Unsafe import
>>  - Harmonize MD4 impl, remove now-redundant checks from ByteArrayAccess (VHs 
>> do bounds checks, most of which will be optimized away)
>>  - Merge branch 'master' into improve_md5
>>  - Apply allocation avoiding optimizations to all SHA versions sharing 
>> structural similarities with MD5
>>  - Remove unused reverseBytes imports
>>  - Copyrights
>>  - Fix copy-paste error
>>  - Various fixes (IDE stopped IDEing..)
>>  - ... and 10 more: 
>> https://git.openjdk.java.net/jdk/compare/6e03c8d3...cafa3e49
>
> test/micro/org/openjdk/bench/java/util/UUIDBench.java line 2:
> 
>> 1: /*
>> 2:  * Copyright (c) 2020, 2021, Oracle and/or its affiliates. All rights 
>> reserved.
> 
> nit: other files should also have this 2021 update. It seems most of them are 
> not updated and still use 2020.

fixed

-

PR: https://git.openjdk.java.net/jdk/pull/1855


Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests

2021-01-15 Thread Valerie Peng
On Sun, 20 Dec 2020 20:27:03 GMT, Claes Redestad  wrote:

> - The MD5 intrinsics added by 
> [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) show that 
> the `int[] x` isn't actually needed. This also applies to the SHA intrinsics 
> from which the MD5 intrinsic takes inspiration.
> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to 
> make it acceptable to use inline and replace the array in MD5 wholesale. This 
> improves performance both in the presence and the absence of the intrinsic 
> optimization.
> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element 
> arrays), but allocating the array lazily gets most of the speed-up in the 
> presence of an intrinsic while being neutral in its absence.
> 

Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests

2021-01-15 Thread Claes Redestad
On Fri, 15 Jan 2021 22:54:32 GMT, Valerie Peng  wrote:

>> - The MD5 intrinsics added by 
>> [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) show that 
>> the `int[] x` isn't actually needed. This also applies to the SHA intrinsics 
>> from which the MD5 intrinsic takes inspiration.
>> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to 
>> make it acceptable to use inline and replace the array in MD5 wholesale. 
>> This improves performance both in the presence and the absence of the 
>> intrinsic optimization.
>> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element 
>> arrays), but allocating the array lazily gets most of the speed-up in the 
>> presence of an intrinsic while being neutral in its absence.
>> 

Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests

2021-01-15 Thread Valerie Peng
On Sun, 20 Dec 2020 20:27:03 GMT, Claes Redestad  wrote:

> - The MD5 intrinsics added by 
> [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) show that 
> the `int[] x` isn't actually needed. This also applies to the SHA intrinsics 
> from which the MD5 intrinsic takes inspiration.
> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to 
> make it acceptable to use inline and replace the array in MD5 wholesale. This 
> improves performance both in the presence and the absence of the intrinsic 
> optimization.
> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element 
> arrays), but allocating the array lazily gets most of the speed-up in the 
> presence of an intrinsic while being neutral in its absence.
> 

Re: RFR: 8259842: Remove Result cache from StringCoding [v2]

2021-01-15 Thread Claes Redestad
On Fri, 15 Jan 2021 21:39:00 GMT, Peter Levart  wrote:

> WDYT?

I get that the approach I took got a bit messy, but I've just spent some time 
cleaning it up. Please have a look at the latest, which outlines much of the 
logic and consolidates the replace/throw logic in the UTF8 decode paths. I've 
checked it does not regress on the micro, and I think the overall state of the 
code now isn't much messier than the original code.

-

PR: https://git.openjdk.java.net/jdk/pull/2102


Re: RFR: 8259842: Remove Result cache from StringCoding [v2]

2021-01-15 Thread Claes Redestad
> The `StringCoding.resultCached` mechanism is used to remove the allocation of 
> a `StringCoding.Result` object on potentially hot paths used in some `String` 
> constructors. Using a `ThreadLocal` has overheads though, and the overhead 
> got a bit worse after JDK-8258596 which addresses a leak by adding a 
> `SoftReference`.
> 
> This patch refactors much of the decode logic back into `String` and gets rid 
> of not only the `Result` cache, but the `Result` class itself along with the 
> `StringDecoder` class and cache.
> 
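For readers unfamiliar with the pattern being removed, a per-thread, softly referenced holder object can be sketched roughly as below. This is an illustration under assumptions, not the actual `StringCoding` source: the class `ResultCacheSketch`, the shape of `Result`, and the method `cachedResult` are all hypothetical.

```java
import java.lang.ref.SoftReference;

// Hypothetical sketch of a ThreadLocal + SoftReference result cache.
class ResultCacheSketch {
    // Holder for the two values a decode step produces.
    static final class Result {
        byte[] value;
        byte coder;
    }

    private static final ThreadLocal<SoftReference<Result>> CACHE = new ThreadLocal<>();

    // Returns this thread's reusable Result, creating it on first use
    // or after the GC has cleared the soft reference.
    static Result cachedResult() {
        SoftReference<Result> ref = CACHE.get();
        Result r = (ref == null) ? null : ref.get();
        if (r == null) {
            r = new Result();
            CACHE.set(new SoftReference<>(r));
        }
        return r;
    }
}
```

Every call pays for a `ThreadLocal` lookup plus a `SoftReference` dereference, which is the overhead the description refers to.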
> Microbenchmark results:
> Baseline
> 
> Benchmark                             (charsetName)  Mode  Cnt    Score     Error  Units
> decodeCharset                              US-ASCII  avgt   15  193.043 ±  8.207  ns/op
> decodeCharset:·gc.alloc.rate.norm          US-ASCII  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                            ISO-8859-1  avgt   15  164.580 ±  6.514  ns/op
> decodeCharset:·gc.alloc.rate.norm        ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                                 UTF-8  avgt   15  328.370 ± 18.420  ns/op
> decodeCharset:·gc.alloc.rate.norm             UTF-8  avgt   15  224.019 ±  0.002   B/op
> decodeCharset                                 MS932  avgt   15  328.870 ± 20.056  ns/op
> decodeCharset:·gc.alloc.rate.norm             MS932  avgt   15  232.020 ±  0.002   B/op
> decodeCharset                            ISO-8859-6  avgt   15  193.603 ± 12.305  ns/op
> decodeCharset:·gc.alloc.rate.norm        ISO-8859-6  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                          US-ASCII  avgt   15  209.454 ±  9.040  ns/op
> decodeCharsetName:·gc.alloc.rate.norm      US-ASCII  avgt   15  112.009 ±  0.001   B/op
> decodeCharsetName                        ISO-8859-1  avgt   15  188.234 ±  7.533  ns/op
> decodeCharsetName:·gc.alloc.rate.norm    ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
> decodeCharsetName                             UTF-8  avgt   15  399.463 ± 12.437  ns/op
> decodeCharsetName:·gc.alloc.rate.norm         UTF-8  avgt   15  224.019 ±  0.003   B/op
> decodeCharsetName                             MS932  avgt   15  358.839 ± 15.385  ns/op
> decodeCharsetName:·gc.alloc.rate.norm         MS932  avgt   15  208.017 ±  0.003   B/op
> decodeCharsetName                        ISO-8859-6  avgt   15  162.570 ±  7.090  ns/op
> decodeCharsetName:·gc.alloc.rate.norm    ISO-8859-6  avgt   15  112.009 ±  0.001   B/op
> decodeDefault                                   N/A  avgt   15  316.081 ± 11.101  ns/op
> decodeDefault:·gc.alloc.rate.norm               N/A  avgt   15  224.019 ±  0.002   B/op
> 
> Patched:
> Benchmark                             (charsetName)  Mode  Cnt    Score     Error  Units
> decodeCharset                              US-ASCII  avgt   15  159.153 ±  6.082  ns/op
> decodeCharset:·gc.alloc.rate.norm          US-ASCII  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                            ISO-8859-1  avgt   15  134.433 ±  6.203  ns/op
> decodeCharset:·gc.alloc.rate.norm        ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                                 UTF-8  avgt   15  297.234 ± 21.654  ns/op
> decodeCharset:·gc.alloc.rate.norm             UTF-8  avgt   15  224.019 ±  0.002   B/op
> decodeCharset                                 MS932  avgt   15  311.583 ± 16.445  ns/op
> decodeCharset:·gc.alloc.rate.norm             MS932  avgt   15  208.018 ±  0.002   B/op
> decodeCharset                            ISO-8859-6  avgt   15  156.216 ±  6.522  ns/op
> decodeCharset:·gc.alloc.rate.norm        ISO-8859-6  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                          US-ASCII  avgt   15  180.752 ±  9.411  ns/op
> decodeCharsetName:·gc.alloc.rate.norm      US-ASCII  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                        ISO-8859-1  avgt   15  156.170 ±  8.003  ns/op
> decodeCharsetName:·gc.alloc.rate.norm    ISO-8859-1  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                             UTF-8  avgt   15  370.337 ± 22.199  ns/op
> decodeCharsetName:·gc.alloc.rate.norm         UTF-8  avgt   15  224.019 ±  0.001   B/op
> decodeCharsetName                             MS932  avgt   15  312.589 ±

Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Peter Levart
On Fri, 15 Jan 2021 19:14:06 GMT, Naoto Sato  wrote:

>> The `StringCoding.resultCached` mechanism is used to remove the allocation 
>> of a `StringCoding.Result` object on potentially hot paths used in some 
>> `String` constructors. Using a `ThreadLocal` has overheads though, and the 
>> overhead got a bit worse after JDK-8258596 which addresses a leak by adding 
>> a `SoftReference`.
>> 
>> This patch refactors much of the decode logic back into `String` and gets 
>> rid of not only the `Result` cache, but the `Result` class itself along with 
>> the `StringDecoder` class and cache.

Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Roger Riggs
On Fri, 15 Jan 2021 20:05:17 GMT, Claes Redestad  wrote:

>> src/java.base/share/classes/java/lang/String.java line 544:
>> 
>>> 542: return;
>>> 543: }
>>> 544: if (charset == UTF_8) {
>> 
>> The constructor is getting big. Might be better to keep the original private 
>> methods (decodeASCII/Latin1/UTF8) for readability.
>
> Since we're calculating two final values, that split was what necessitated a 
> `Result` object. Until valhalla I don't think there's a way to get rid of the 
> performance cost here without putting the bulk of the logic into the 
> constructor.
> 
> Refactoring out some of the logic to utility methods could be a performance 
> neutral way to cut down the complexity, though. E.g.:
>     char c = (char)((b1 << 12) ^
>                     (b2 <<  6) ^
>                     (b3 ^
>                      (((byte) 0xE0 << 12) ^
>                       ((byte) 0x80 <<  6) ^
>                       ((byte) 0x80 <<  0))));
>     if (Character.isSurrogate(c)) {
>         putChar(dst, dp++, REPL);
>     } else {
>         putChar(dst, dp++, c);
>     }
> could be reasonably factored out and reduced to something like:
>     putChar(dst, dp++, StringCoding.decode3(b1, b2, b3));
> I've refrained from refurbishing too much, though.
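
For illustration, a `decode3` helper of the kind floated in the quoted text might look like the sketch below. The enclosing class `Decode3Sketch` and the `REPL` constant are assumptions made for the example, and it presumes the three bytes were already validated as a well-formed 3-byte UTF-8 sequence. It is not the code that ended up in the PR.

```java
// Hypothetical sketch of the suggested decode3 helper.
class Decode3Sketch {
    // Unicode replacement character, used for malformed input.
    static final char REPL = '\ufffd';

    // Combines the payload bits of a 3-byte UTF-8 sequence into one char.
    // The XOR-ed constants cancel both the UTF-8 prefix bits and the
    // sign extension of the byte operands. An encoded surrogate is not
    // legal UTF-8, so it maps to REPL.
    static char decode3(byte b1, byte b2, byte b3) {
        char c = (char) ((b1 << 12) ^
                         (b2 <<  6) ^
                         (b3 ^
                          (((byte) 0xE0 << 12) ^
                           ((byte) 0x80 <<  6) ^
                           ((byte) 0x80 <<  0))));
        return Character.isSurrogate(c) ? REPL : c;
    }
}
```

The caller would then shrink to a single `putChar(dst, dp++, decode3(b1, b2, b3))`, with the surrogate check hidden inside the helper.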

I don't think you need to inline quite so much.  Once the determination has 
been made about whether the result is Latin1 or UTF16 it calls separate methods 
anyway.  For example, after calling hasNegatives() it is known what coding is 
needed.
Then call out to a method that returns the byte array.
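
A minimal sketch of that shape, assuming a simplified stand-in for `hasNegatives()` and a hypothetical fast-path method. Neither is the JDK's internal implementation, and the names `CoderDispatchSketch` and `decodeAsciiFastPath` are invented here.

```java
import java.util.Arrays;

// Hypothetical sketch: decide the coding first, then call out to a
// method that returns the byte array.
class CoderDispatchSketch {
    // Simplified stand-in for the JDK-internal check: true if any byte
    // in the range has its high bit set.
    static boolean hasNegatives(byte[] ba, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if (ba[i] < 0) {
                return true;
            }
        }
        return false;
    }

    // Pure-ASCII input can be stored as Latin1 by copying the bytes.
    // Returns null when the caller must take the slower decoding path.
    static byte[] decodeAsciiFastPath(byte[] ba, int off, int len) {
        if (hasNegatives(ba, off, len)) {
            return null;
        }
        return Arrays.copyOfRange(ba, off, off + len);
    }
}
```

The constructor would only need to pick the coder and assign the returned array, which keeps the two final fields set in one place without a `Result` holder.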

-

PR: https://git.openjdk.java.net/jdk/pull/2102


Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Roger Riggs
On Fri, 15 Jan 2021 20:14:48 GMT, Roger Riggs  wrote:

>> Since we're calculating two final values, that split was what necessitated a 
>> `Result` object. Until valhalla I don't think there's a way to get rid of 
>> the performance cost here without putting the bulk of the logic into the 
>> constructor.
>> 
>> Refactoring out some of the logic to utility methods could be a performance 
>> neutral way to cut down the complexity, though. E.g.:
>>     char c = (char)((b1 << 12) ^
>>                     (b2 <<  6) ^
>>                     (b3 ^
>>                      (((byte) 0xE0 << 12) ^
>>                       ((byte) 0x80 <<  6) ^
>>                       ((byte) 0x80 <<  0))));
>>     if (Character.isSurrogate(c)) {
>>         putChar(dst, dp++, REPL);
>>     } else {
>>         putChar(dst, dp++, c);
>>     }
>> could be reasonably factored out and reduced to something like:
>>     putChar(dst, dp++, StringCoding.decode3(b1, b2, b3));
>> I've refrained from refurbishing too much, though.
>
> I don't think you need to inline quite so much.  Once the determination has 
> been made about whether the result is Latin1 or UTF16 it calls separate 
> methods anyway.  For example, after calling hasNegatives() it is known what 
> coding is needed.
> Then call out to a method that returns the byte array.

(Also, every email is very long because it includes the performance data; next 
time please put the performance data in a separate comment, not the PR).

-

PR: https://git.openjdk.java.net/jdk/pull/2102


Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Claes Redestad
On Fri, 15 Jan 2021 19:11:38 GMT, Naoto Sato  wrote:

>> The `StringCoding.resultCached` mechanism is used to remove the allocation 
>> of a `StringCoding.Result` object on potentially hot paths used in some 
>> `String` constructors. Using a `ThreadLocal` has overheads though, and the 
>> overhead got a bit worse after JDK-8258596 which addresses a leak by adding 
>> a `SoftReference`.
>> 
>> This patch refactors much of the decode logic back into `String` and gets 
>> rid of not only the `Result` cache, but the `Result` class itself along with 
>> the `StringDecoder` class and cache.

Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Naoto Sato
On Fri, 15 Jan 2021 14:33:19 GMT, Claes Redestad  wrote:

> The `StringCoding.resultCached` mechanism is used to remove the allocation of 
> a `StringCoding.Result` object on potentially hot paths used in some `String` 
> constructors. Using a `ThreadLocal` has overheads though, and the overhead 
> got a bit worse after JDK-8258596 which addresses a leak by adding a 
> `SoftReference`.
> 
> This patch refactors much of the decode logic back into `String` and gets rid 
> of not only the `Result` cache, but the `Result` class itself along with the 
> `StringDecoder` class and cache.

Integrated: 8259048: (tz) Upgrade time-zone data to tzdata2020f

2021-01-15 Thread Kiran Sidhartha Ravikumar
On Mon, 4 Jan 2021 18:11:05 GMT, Kiran Sidhartha Ravikumar wrote:

> Hi Guys,
> 
> Updating the summary as tzdata2020f is available and includes tzdata2020e 
> changes also.
> 
> Please review the integration of tzdata2020f to JDK.
> 
> Details regarding the change can be viewed at - 
> https://mm.icann.org/pipermail/tz-announce/2020-December/64.html
> Bug: https://bugs.openjdk.java.net/browse/JDK-8259048
> 
> tzdata2020e - Most of the changes correct past timestamps, and the 
> Australia/Currie timezone is removed.
> tzdata2020f - No changes to the data since 2020e.
> 
> Regression Tests pass along with JCK.
> 
> Please let me know if the changes are good to push.
> 
> Thanks,
> Kiran

This pull request has now been integrated.

Changeset: fe84ecd5
Author:    Kiran Sidhartha Ravikumar
URL:       https://git.openjdk.java.net/jdk/commit/fe84ecd5
Stats:     729 lines in 10 files changed: 578 ins; 19 del; 132 mod

8259048: (tz) Upgrade time-zone data to tzdata2020f

Reviewed-by: naoto, erikj

-

PR: https://git.openjdk.java.net/jdk/pull/1937


Re: RFR: 8257733: Move module-specific data from make to respective module [v4]

2021-01-15 Thread Magnus Ihse Bursie
On Fri, 15 Jan 2021 14:58:14 GMT, Alan Bateman  wrote:

>> This PR is not stale; it's just still waiting for input from @AlanBateman.
>
> @magicus Can the CharacterDataXXX.template files move to 
> src/java.base/share/classes/java/lang?

@AlanBateman That sounds like an excellent idea. I'll update the PR first thing 
next week. :)

-

PR: https://git.openjdk.java.net/jdk/pull/1611


Re: RFR: 8257733: Move module-specific data from make to respective module

2021-01-15 Thread mark . reinhold
2020/12/4 6:08:13 -0800, er...@openjdk.java.net:
> On Fri, 4 Dec 2020 12:30:02 GMT, Alan Bateman  wrote:
>>> And I can certainly move jdwp.spec to java.base instead. That's the
>>> reason I need input on this: All I know is that is definitely not
>>> the responsibility of the Build Group to maintain that document, and
>>> I made my best guess at where to place it.
>> 
>>> And I can certainly move jdwp.spec to java.base instead.
>> 
>> If jdwp.spec has to move to the src tree then src/java.se is probably
>> the most suitable home because Java SE specifies JDWP as an optional
>> interface. So nothing to do with java.base and the build will need to
>> continue to generate the sources for the front-end (jdk.jdi) and
>> back-end (jdk.jdwp.agent) as they implement the protocol.
> 
> My understanding of JEPs is that they should be viewed as living
> documents. In this case, I think it's perfectly valid to update JEP
> 201 with additional source code layout. Both for this and for the
> other missing dirs.

Feature JEPs are living documents until such time as they are delivered.
In this case it would not be appropriate to update JEP 201, which is as
much about the transition from the old source-code layout as it is about
the new layout as of 2014.

At this point, and given that we’d already gone beyond JEP 201 prior to
this change (with `man` and `lib` subdirectories), what’d make the most
sense is a new informational JEP that documents the source-code layout.
Informational JEPs can, within reason, be updated over time.

- Mark


Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Claes Redestad
On Fri, 15 Jan 2021 16:49:57 GMT, Roger Riggs  wrote:

> Interesting, I was/am in the middle of converting Result to be a Valhalla 
> primitive class to reduce the memory pressure and had written some new jmh 
> tests too.
> And to reduce the dependency on ThreadLocals.

Ok, I expect that would end up at similar performance while retaining the 
separation of concerns. But this way we're not dependent on valhalla to get rid 
of the TLs.

I'd be happy to add more JMH tests here. I expected this area to already have 
some, but it seems all the micros added during the compact strings work in 
JDK 9 are unaccounted for.

-

PR: https://git.openjdk.java.net/jdk/pull/2102


Integrated: 8193031: Collections.addAll is likely to perform worse than Collection.addAll

2021-01-15 Thread Сергей Цыпанов
On Mon, 14 Dec 2020 12:13:23 GMT, Сергей Цыпанов wrote:

> Hello, I feel like this was previously discussed in 
> https://mail.openjdk.java.net/pipermail/core-libs-dev/ but since I cannot 
> find the original mail, I am posting this here.
> 
> Currently `Collections.addAll()` is implemented and documented as:
> /**
>  * ...
>  * The behavior of this convenience method is identical to that of
>  * {@code c.addAll(Arrays.asList(elements))}, but this method is likely
>  * to run significantly faster under most implementations.
>  */
> @SafeVarargs
> public static <T> boolean addAll(Collection<? super T> c, T... elements) {
>     boolean result = false;
>     for (T element : elements)
>         result |= c.add(element);
>     return result;
> }
> 
> But in practice the claim that `this method is likely to run significantly 
> faster under most implementations` is completely wrong. When I take this 
> [benchmark](https://github.com/stsypanov/benchmarks/blob/master/benchmark-runners/src/main/java/com/luxoft/logeek/benchmark/collection/CollectionsAddAllVsAddAllBenchmark.java)
> and run it on JDK 14, I get the following results:
> Benchmark                                       (collection)  (size)     Score       Error  Units
> addAll                                             ArrayList      10      37.9 ±     1.9  ns/op
> addAll                                             ArrayList     100      83.8 ±     3.4  ns/op
> addAll                                             ArrayList    1000     678.2 ±    23.0  ns/op
> collectionsAddAll                                  ArrayList      10      50.9 ±     1.1  ns/op
> collectionsAddAll                                  ArrayList     100     751.4 ±    47.4  ns/op
> collectionsAddAll                                  ArrayList    1000    8839.8 ±   710.7  ns/op
> 
> addAll                                               HashSet      10     128.4 ±     5.9  ns/op
> addAll                                               HashSet     100    1864.2 ±   102.4  ns/op
> addAll                                               HashSet    1000   16615.5 ±  1202.6  ns/op
> collectionsAddAll                                    HashSet      10     172.8 ±     6.0  ns/op
> collectionsAddAll                                    HashSet     100    2355.8 ±   195.4  ns/op
> collectionsAddAll                                    HashSet    1000   20364.7 ±  1164.0  ns/op
> 
> addAll                                            ArrayDeque      10      54.0 ±     0.4  ns/op
> addAll                                            ArrayDeque     100     319.7 ±     2.5  ns/op
> addAll                                            ArrayDeque    1000    3176.9 ±    22.2  ns/op
> collectionsAddAll                                 ArrayDeque      10      66.5 ±     1.4  ns/op
> collectionsAddAll                                 ArrayDeque     100     808.1 ±    55.9  ns/op
> collectionsAddAll                                 ArrayDeque    1000    5639.6 ±   240.9  ns/op
> 
> addAll                                  CopyOnWriteArrayList      10      18.0 ±     0.7  ns/op
> addAll                                  CopyOnWriteArrayList     100      39.4 ±     1.7  ns/op
> addAll                                  CopyOnWriteArrayList    1000     371.1 ±    17.0  ns/op
> collectionsAddAll                       CopyOnWriteArrayList      10     251.9 ±    18.4  ns/op
> collectionsAddAll                       CopyOnWriteArrayList     100    3405.9 ±   304.8  ns/op
> collectionsAddAll                       CopyOnWriteArrayList    1000  247496.8 ± 23502.3  ns/op
> 
> addAll                                 ConcurrentLinkedDeque      10      81.4 ±     2.8  ns/op
> addAll                                 ConcurrentLinkedDeque     100     609.1 ±    26.4  ns/op
> addAll                                 ConcurrentLinkedDeque    1000    4494.5 ±   219.3  ns/op
> collectionsAddAll                      ConcurrentLinkedDeque      10     189.8 ±     2.5  ns/op
> collectionsAddAll                      ConcurrentLinkedDeque     100    1660.0 ±    62.0  ns/op
> collectionsAddAll                      ConcurrentLinkedDeque    1000   17649.2 ±   300.9  ns/op
> 
> addAll:·gc.alloc.rate.norm                         ArrayList      10     160.0 ±     0.0   B/op
> addAll:·gc.alloc.rate.norm                         ArrayList     100     880.0 ±     0.0   B/op
> addAll:·gc.alloc.rate.norm                         ArrayList    1000    8080.3 ±     0.0   B/op
> collectionsAddAll:·gc.alloc.rate.norm              ArrayList      10      80.0 ±     0.0   B/op
> collectionsAddAll:·gc.alloc.rate.norm              ArrayList     100    1400.2 ±     0.0   B/op
>
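
The two call shapes compared by the benchmark rows above can be sketched as follows. The class and method names (`AddAllSketch`, `viaCollections`, `viaCollection`) are made up for the example. The sketch only shows that both calls produce the same contents and makes no performance claim of its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the two ways to add an array's elements.
class AddAllSketch {
    // Element-by-element: Collections.addAll calls c.add(e) in a loop.
    static List<Integer> viaCollections(Integer... elements) {
        List<Integer> list = new ArrayList<>();
        Collections.addAll(list, elements);
        return list;
    }

    // Bulk: wraps the array in a fixed-size list view and lets the
    // target collection's own addAll handle the whole batch.
    static List<Integer> viaCollection(Integer... elements) {
        List<Integer> list = new ArrayList<>();
        list.addAll(Arrays.asList(elements));
        return list;
    }
}
```

The bulk form lets a collection grow once and copy the batch in one step, while the loop form pays the per-`add` cost every time. That cost is largest for `CopyOnWriteArrayList`, where each `add` copies the backing array.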

Re: RFR: 8193031: Collections.addAll is likely to perform worse than Collection.addAll [v7]

2021-01-15 Thread Peter Levart
On Thu, 14 Jan 2021 22:38:53 GMT, Peter Levart  wrote:

>> Сергей Цыпанов has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains 10 additional 
>> commits since the last revision:
>> 
>>  - Merge branch 'master' into add-all
>>  - 8193031: fix JavaDoc
>>  - Merge branch 'master' into add-all
>>  - Merge branch 'master' into add-all
>>  - 8193031: revert implementation change but keep one for JavaDoc
>>  - Merge branch 'master' into add-all
>>  - Merge remote-tracking branch 'origin/add-all' into add-all
>>  - 8193031: add elements in bulk in Collections.addAll()
>>  - Merge branch 'master' into add-all
>>  - 8193031: add elements in bulk in Collections.addAll()
>
> I think this looks good.

Hearing no objections, I'll sponsor this for Sergei.

-

PR: https://git.openjdk.java.net/jdk/pull/1764


Re: RFR: 8259048: (tz) Upgrade time-zone data to tzdata2020f [v2]

2021-01-15 Thread Naoto Sato
On Fri, 15 Jan 2021 10:24:24 GMT, Kiran Sidhartha Ravikumar wrote:

>> Hi Guys,
>> 
>> Updating the summary as tzdata2020f is available and includes tzdata2020e 
>> changes also.
>> 
>> Please review the integration of tzdata2020f to JDK.
>> 
>> Details regarding the change can be viewed at - 
>> https://mm.icann.org/pipermail/tz-announce/2020-December/64.html
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8259048
>> 
>> tzdata2020e - Most of the changes correct past timestamps, and the 
>> Australia/Currie timezone is removed.
>> tzdata2020f - No changes to the data since 2020e.
>> 
>> Regression Tests pass along with JCK.
>> 
>> Please let me know if the changes are good to push.
>> 
>> Thanks,
>> Kiran
>
> Kiran Sidhartha Ravikumar has updated the pull request with a new target base 
> due to a merge or a rebase. The incremental webrev excludes the unrelated 
> changes brought in by the merge/rebase. The pull request contains three 
> additional commits since the last revision:
> 
>  - 8258878: (tz) Upgrade time-zone data to tzdata2020e
>  - Merge remote-tracking branch 'origin/master' into JDK-8258878
>  - 8258878: (tz) Upgrade time-zone data to tzdata2020e

Marked as reviewed by naoto (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/1937


Re: RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Roger Riggs
On Fri, 15 Jan 2021 14:33:19 GMT, Claes Redestad  wrote:

> The `StringCoding.resultCached` mechanism is used to remove the allocation of 
> a `StringCoding.Result` object on potentially hot paths used in some `String` 
> constructors. Using a `ThreadLocal` has overheads though, and the overhead 
> got a bit worse after JDK-8258596 which addresses a leak by adding a 
> `SoftReference`.
> 
> This patch refactors much of the decode logic back into `String` and gets rid 
> of not only the `Result` cache, but the `Result` class itself along with the 
> `StringDecoder` class and cache.
> 
> Microbenchmark results:
> Baseline
> 
> Benchmark                              (charsetName)  Mode  Cnt    Score    Error  Units
> decodeCharset                               US-ASCII  avgt   15  193.043 ±  8.207  ns/op
> decodeCharset:·gc.alloc.rate.norm           US-ASCII  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                             ISO-8859-1  avgt   15  164.580 ±  6.514  ns/op
> decodeCharset:·gc.alloc.rate.norm         ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                                  UTF-8  avgt   15  328.370 ± 18.420  ns/op
> decodeCharset:·gc.alloc.rate.norm              UTF-8  avgt   15  224.019 ±  0.002   B/op
> decodeCharset                                  MS932  avgt   15  328.870 ± 20.056  ns/op
> decodeCharset:·gc.alloc.rate.norm              MS932  avgt   15  232.020 ±  0.002   B/op
> decodeCharset                             ISO-8859-6  avgt   15  193.603 ± 12.305  ns/op
> decodeCharset:·gc.alloc.rate.norm         ISO-8859-6  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                           US-ASCII  avgt   15  209.454 ±  9.040  ns/op
> decodeCharsetName:·gc.alloc.rate.norm       US-ASCII  avgt   15  112.009 ±  0.001   B/op
> decodeCharsetName                         ISO-8859-1  avgt   15  188.234 ±  7.533  ns/op
> decodeCharsetName:·gc.alloc.rate.norm     ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
> decodeCharsetName                              UTF-8  avgt   15  399.463 ± 12.437  ns/op
> decodeCharsetName:·gc.alloc.rate.norm          UTF-8  avgt   15  224.019 ±  0.003   B/op
> decodeCharsetName                              MS932  avgt   15  358.839 ± 15.385  ns/op
> decodeCharsetName:·gc.alloc.rate.norm          MS932  avgt   15  208.017 ±  0.003   B/op
> decodeCharsetName                         ISO-8859-6  avgt   15  162.570 ±  7.090  ns/op
> decodeCharsetName:·gc.alloc.rate.norm     ISO-8859-6  avgt   15  112.009 ±  0.001   B/op
> decodeDefault                                    N/A  avgt   15  316.081 ± 11.101  ns/op
> decodeDefault:·gc.alloc.rate.norm                N/A  avgt   15  224.019 ±  0.002   B/op
> 
> Patched:
> Benchmark                              (charsetName)  Mode  Cnt    Score    Error  Units
> decodeCharset                               US-ASCII  avgt   15  159.153 ±  6.082  ns/op
> decodeCharset:·gc.alloc.rate.norm           US-ASCII  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                             ISO-8859-1  avgt   15  134.433 ±  6.203  ns/op
> decodeCharset:·gc.alloc.rate.norm         ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
> decodeCharset                                  UTF-8  avgt   15  297.234 ± 21.654  ns/op
> decodeCharset:·gc.alloc.rate.norm              UTF-8  avgt   15  224.019 ±  0.002   B/op
> decodeCharset                                  MS932  avgt   15  311.583 ± 16.445  ns/op
> decodeCharset:·gc.alloc.rate.norm              MS932  avgt   15  208.018 ±  0.002   B/op
> decodeCharset                             ISO-8859-6  avgt   15  156.216 ±  6.522  ns/op
> decodeCharset:·gc.alloc.rate.norm         ISO-8859-6  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                           US-ASCII  avgt   15  180.752 ±  9.411  ns/op
> decodeCharsetName:·gc.alloc.rate.norm       US-ASCII  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                         ISO-8859-1  avgt   15  156.170 ±  8.003  ns/op
> decodeCharsetName:·gc.alloc.rate.norm     ISO-8859-1  avgt   15  112.010 ±  0.001   B/op
> decodeCharsetName                              UTF-8  avgt   15  370.337 ± 22.199  ns/op
> decodeCharsetName:·gc.alloc.rate.norm          UTF-8  avgt   15  224.019 ±  0.001   B/op
> decodeCharsetName   

Re: RFR: 6594730: UUID.getVersion() is only meaningful for Leach-Salz variant

2021-01-15 Thread some-java-user-99206970363698485155
>   1. Replace ` ` with a normal space, that should work as well and is easier 
> to read
Looks like my e-mail client was so kind and replaced the HTML character 
reference. It should have said:
"Replace `& nbsp ;` with a normal space, ..."

Additionally, if you want to search for projects using UUID.version() you can 
use the following
CodeQL query:
https://lgtm.com/query/283083268427438766/

You can (in addition to the example projects), specify custom projects to scan 
as well, see
https://lgtm.com/help/lgtm/project-lists.

Kind regards


Re: Monitoring wrapped ThreadPoolExecutor returned from Executors

2021-01-15 Thread Alan Bateman

On 15/01/2021 10:51, Tommy Ludwig wrote:

> Parsing the delegate toString would be a usable workaround for the time being
> for Micrometer, assuming the format is stable across JDK versions. It's better
> than no alternative, which is where I think we are currently at for the wrapped
> cases.

As a temporary workaround you can use --add-opens to open 
java.base/java.util.concurrent. That will keep existing hacks working 
until you have something better.


Given the importance of specific thread pools in some applications, 
there may be merit in exploring a management interface 
(PlatformManagedObject) that would expose some stats. This could allow 
both local and remote tooling to monitor/sample the stats of interesting 
thread pools.
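
To make the idea concrete, here is a minimal sketch of an application-level MXBean that exposes a few `ThreadPoolExecutor` stats through the platform MBean server. It is not the `PlatformManagedObject` design suggested above and not an existing JDK API: the names `PoolStatsSketch`, `PoolStatsMXBean`, `PoolStats` and the `example.pools` domain are all hypothetical, and it only works for pools the application can reference directly.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import javax.management.JMException;
import javax.management.ObjectName;

public class PoolStatsSketch {

    /** Hypothetical management interface; not part of the JDK. */
    public interface PoolStatsMXBean {
        int getPoolSize();
        int getActiveCount();
        int getQueuedTaskCount();
        long getCompletedTaskCount();
    }

    /** Read-only view over a ThreadPoolExecutor created by the application. */
    public static final class PoolStats implements PoolStatsMXBean {
        private final ThreadPoolExecutor pool;

        public PoolStats(ThreadPoolExecutor pool) { this.pool = pool; }

        @Override public int getPoolSize() { return pool.getPoolSize(); }
        @Override public int getActiveCount() { return pool.getActiveCount(); }
        @Override public int getQueuedTaskCount() { return pool.getQueue().size(); }
        @Override public long getCompletedTaskCount() { return pool.getCompletedTaskCount(); }
    }

    /** Registers the stats view under a hypothetical "example.pools" domain. */
    public static ObjectName register(ThreadPoolExecutor pool, String name) {
        try {
            ObjectName objectName = new ObjectName("example.pools:type=PoolStats,name=" + name);
            ManagementFactory.getPlatformMBeanServer().registerMBean(new PoolStats(pool), objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads one attribute the way a local monitoring tool could. */
    public static Object readAttribute(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        ObjectName name = register(pool, "demo");
        pool.submit(() -> { }).get();
        System.out.println("PoolSize = " + readAttribute(name, "PoolSize"));
        System.out.println("QueuedTaskCount = " + readAttribute(name, "QueuedTaskCount"));
        pool.shutdown();
        ManagementFactory.getPlatformMBeanServer().unregisterMBean(name);
    }
}
```

A remote tool could read the same attributes over a JMX connector. The sketch does not address the wrapped executors returned by `Executors`, which is the open question in this thread.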


-Alan



Re: RFR: 6594730: UUID.getVersion() is only meaningful for Leach-Salz variant

2021-01-15 Thread some-java-user-99206970363698485155
The following probably does not matter much because I am not an OpenJDK 
contributor, but personally
I think throwing an UnsupportedOperationException is reasonable:
  1. It is consistent with the other methods which also only work for one 
specific variant
  2. Code which calls UUID.version() on a non-variant 2 UUID is obviously 
already functionally broken;
 I don't think maintaining backward compatibility here adds any value
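
For callers who want to stay safe regardless of which behaviour is chosen, a minimal sketch of a guard is shown here. The helper name `versionIfLeachSalz` and the class `UuidVersionSketch` are hypothetical; only `UUID.variant()` and `UUID.version()` are existing API.

```java
import java.util.OptionalInt;
import java.util.UUID;

public class UuidVersionSketch {

    /** Variant value that UUID.variant() documents for Leach-Salz UUIDs. */
    static final int LEACH_SALZ_VARIANT = 2;

    /** Returns the version only when it is meaningful, i.e. for variant 2 UUIDs. */
    static OptionalInt versionIfLeachSalz(UUID uuid) {
        return uuid.variant() == LEACH_SALZ_VARIANT
                ? OptionalInt.of(uuid.version())
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        UUID random = UUID.randomUUID();   // variant 2, version 4
        UUID nil = new UUID(0L, 0L);       // variant 0, version not meaningful
        System.out.println(versionIfLeachSalz(random));
        System.out.println(versionIfLeachSalz(nil));
    }
}
```

Because the variant is checked first, this code never calls `version()` on a non-variant 2 UUID, so it behaves the same whether or not that call throws.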

Regarding the pull request, I would recommend the following changes:
  1. Replace ` ` with a normal space, that should work as well and is 
easier to read
  2. Add a sentence to the method description (and not only to the `@throws` 
tag), that this method
 is only meaningful for variant 2 UUIDs, see for example the documentation 
for `timestamp()` for
 what this sentence should look like to be consistent:
 
https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/UUID.html#timestamp()

Kind regards


RFR: 8259842: Remove Result cache from StringCoding

2021-01-15 Thread Claes Redestad
The `StringCoding.resultCached` mechanism is used to remove the allocation of a 
`StringCoding.Result` object on potentially hot paths used in some `String` 
constructors. Using a `ThreadLocal` has overheads though, and the overhead got 
a bit worse after JDK-8258596 which addresses a leak by adding a 
`SoftReference`.

This patch refactors much of the decode logic back into `String` and gets rid 
of not only the `Result` cache, but the `Result` class itself along with the 
`StringDecoder` class and cache.
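
For readers unfamiliar with the pattern being removed, the following is a simplified sketch of a per-thread result cache held through a `SoftReference`, next to plain allocation. It is an illustration only, not the actual `StringCoding` code: `ResultCacheSketch`, its `Result` holder and both method names are hypothetical.

```java
import java.lang.ref.SoftReference;

public class ResultCacheSketch {

    /** Hypothetical stand-in for a small mutable result holder. */
    public static final class Result {
        byte[] value;
        byte coder;
    }

    // Per-thread cache. The SoftReference lets the GC reclaim the holder under
    // memory pressure, at the cost of an extra indirection on every lookup.
    private static final ThreadLocal<SoftReference<Result>> CACHE = new ThreadLocal<>();

    /** Cached path: a ThreadLocal lookup plus a SoftReference dereference per call. */
    public static Result cachedResult() {
        SoftReference<Result> ref = CACHE.get();
        Result result = (ref == null) ? null : ref.get();
        if (result == null) {
            result = new Result();
            CACHE.set(new SoftReference<>(result));
        }
        return result;
    }

    /** Uncached path: a short-lived allocation that the JIT may be able to eliminate. */
    public static Result freshResult() {
        return new Result();
    }

    public static void main(String[] args) {
        Result first = cachedResult();
        System.out.println(first == cachedResult());          // same holder on this thread
        System.out.println(freshResult() == freshResult());   // distinct objects
    }
}
```

The cached path trades one small allocation for two lookups on every call, which is the overhead the description above refers to.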

Microbenchmark results:
Baseline

Benchmark                              (charsetName)  Mode  Cnt    Score    Error  Units
decodeCharset                               US-ASCII  avgt   15  193.043 ±  8.207  ns/op
decodeCharset:·gc.alloc.rate.norm           US-ASCII  avgt   15  112.009 ±  0.001   B/op
decodeCharset                             ISO-8859-1  avgt   15  164.580 ±  6.514  ns/op
decodeCharset:·gc.alloc.rate.norm         ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
decodeCharset                                  UTF-8  avgt   15  328.370 ± 18.420  ns/op
decodeCharset:·gc.alloc.rate.norm              UTF-8  avgt   15  224.019 ±  0.002   B/op
decodeCharset                                  MS932  avgt   15  328.870 ± 20.056  ns/op
decodeCharset:·gc.alloc.rate.norm              MS932  avgt   15  232.020 ±  0.002   B/op
decodeCharset                             ISO-8859-6  avgt   15  193.603 ± 12.305  ns/op
decodeCharset:·gc.alloc.rate.norm         ISO-8859-6  avgt   15  112.010 ±  0.001   B/op
decodeCharsetName                           US-ASCII  avgt   15  209.454 ±  9.040  ns/op
decodeCharsetName:·gc.alloc.rate.norm       US-ASCII  avgt   15  112.009 ±  0.001   B/op
decodeCharsetName                         ISO-8859-1  avgt   15  188.234 ±  7.533  ns/op
decodeCharsetName:·gc.alloc.rate.norm     ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
decodeCharsetName                              UTF-8  avgt   15  399.463 ± 12.437  ns/op
decodeCharsetName:·gc.alloc.rate.norm          UTF-8  avgt   15  224.019 ±  0.003   B/op
decodeCharsetName                              MS932  avgt   15  358.839 ± 15.385  ns/op
decodeCharsetName:·gc.alloc.rate.norm          MS932  avgt   15  208.017 ±  0.003   B/op
decodeCharsetName                         ISO-8859-6  avgt   15  162.570 ±  7.090  ns/op
decodeCharsetName:·gc.alloc.rate.norm     ISO-8859-6  avgt   15  112.009 ±  0.001   B/op
decodeDefault                                    N/A  avgt   15  316.081 ± 11.101  ns/op
decodeDefault:·gc.alloc.rate.norm                N/A  avgt   15  224.019 ±  0.002   B/op

Patched:
Benchmark                              (charsetName)  Mode  Cnt    Score    Error  Units
decodeCharset                               US-ASCII  avgt   15  159.153 ±  6.082  ns/op
decodeCharset:·gc.alloc.rate.norm           US-ASCII  avgt   15  112.009 ±  0.001   B/op
decodeCharset                             ISO-8859-1  avgt   15  134.433 ±  6.203  ns/op
decodeCharset:·gc.alloc.rate.norm         ISO-8859-1  avgt   15  112.009 ±  0.001   B/op
decodeCharset                                  UTF-8  avgt   15  297.234 ± 21.654  ns/op
decodeCharset:·gc.alloc.rate.norm              UTF-8  avgt   15  224.019 ±  0.002   B/op
decodeCharset                                  MS932  avgt   15  311.583 ± 16.445  ns/op
decodeCharset:·gc.alloc.rate.norm              MS932  avgt   15  208.018 ±  0.002   B/op
decodeCharset                             ISO-8859-6  avgt   15  156.216 ±  6.522  ns/op
decodeCharset:·gc.alloc.rate.norm         ISO-8859-6  avgt   15  112.010 ±  0.001   B/op
decodeCharsetName                           US-ASCII  avgt   15  180.752 ±  9.411  ns/op
decodeCharsetName:·gc.alloc.rate.norm       US-ASCII  avgt   15  112.010 ±  0.001   B/op
decodeCharsetName                         ISO-8859-1  avgt   15  156.170 ±  8.003  ns/op
decodeCharsetName:·gc.alloc.rate.norm     ISO-8859-1  avgt   15  112.010 ±  0.001   B/op
decodeCharsetName                              UTF-8  avgt   15  370.337 ± 22.199  ns/op
decodeCharsetName:·gc.alloc.rate.norm          UTF-8  avgt   15  224.019 ±  0.001   B/op
decodeCharsetName                              MS932  avgt   15  312.589 ± 15.067  ns/op
decodeCharsetName:·gc.alloc.rate.norm          MS932  avgt   15  208.018 ±  0.002   B/op
decodeCharsetName  ISO-8859-6  avgt   

Re: RFR: 8257733: Move module-specific data from make to respective module [v4]

2021-01-15 Thread Alan Bateman
On Mon, 11 Jan 2021 09:20:07 GMT, Magnus Ihse Bursie  wrote:

>> Marked as reviewed by prr (Reviewer).
>
> This PR is not stale; it's just still waiting for input from @AlanBateman.

@magicus Can the CharacterDataXXX.template files move to 
src/java.base/share/classes/java/lang?

-

PR: https://git.openjdk.java.net/jdk/pull/1611


Re: Monitoring wrapped ThreadPoolExecutor returned from Executors

2021-01-15 Thread Tommy Ludwig
Parsing the delegate toString would be a usable workaround for the time being 
for Micrometer, assuming the format is stable across JDK versions. It's better 
than no alternative, which is where I think we are currently at for the wrapped 
cases.

On 2021/01/09 1:24, "Doug Lea"  wrote:

On 1/7/21 12:57 PM, Jason Mehrens wrote:
> Hi Doug,
>
> What are your thoughts on promoting monitoring methods from TPE and/or
> FJP to AbstractExecutorService?  The default implementations could just return
> -1.  An example precedent is OperatingSystemMXBean::getSystemLoadAverage.  The
> Executors.DelegatedExecutorService could then be modified to extend
> AbstractExecutorService and forward the new methods and the existing
> AES::taskFor calls when the wrapped Executor is also an
> AbstractExecutorService.  The return types of the Executors.newXXX would remain
> the same.

Maybe. But for now, here's a cheap trick that might be tolerable: Add to 
DelegatedExecutorService:

public String toString() { return e.toString(); }

The juc executors (ThreadPoolExecutor and ForkJoinPool that could be 
wrapped here) both print pool size, active threads, queued tasks, and 
completed tasks. It would require not-very-pleasant string parsing in 
monitoring tools, but this might be good enough for Micrometer and others?
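
As a rough illustration of the string parsing this would require, here is a minimal sketch that extracts the `name = number` pairs from an executor's `toString()` output. The class `ExecutorToStringParser` and the method `parseStats` are hypothetical names. It assumes the bracketed format that `ThreadPoolExecutor.toString()` currently prints, which is not a specified contract, and `main` uses a directly constructed `ThreadPoolExecutor` because the wrappers returned by `Executors` only forward `toString()` if the change above is made.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExecutorToStringParser {

    // Matches "name = number" pairs such as "pool size = 1".
    private static final Pattern STAT = Pattern.compile("([a-z][a-z ]*?) = (\\d+)");

    /** Returns the numeric stats found inside the bracketed part of the description. */
    static Map<String, Long> parseStats(String description) {
        Map<String, Long> stats = new LinkedHashMap<>();
        int open = description.indexOf('[');
        if (open < 0) {
            return stats; // no bracketed section: nothing to parse
        }
        Matcher matcher = STAT.matcher(description.substring(open));
        while (matcher.find()) {
            stats.put(matcher.group(1), Long.parseLong(matcher.group(2)));
        }
        return stats;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.submit(() -> { }).get();
        System.out.println(pool);
        System.out.println(parseStats(pool.toString()));
        pool.shutdown();
    }
}
```

Tools relying on this would need to tolerate missing keys, since the set of fields differs between `ThreadPoolExecutor` and `ForkJoinPool` and could change between JDK releases.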


>
> I suppose the tradeoff is that adding any new default method to
> ExecutorService and/or new methods to AbstractExecutorService could break 3rd
> party code.
>
> Jason
>
> 
> From: core-libs-dev on behalf of Doug Lea
> Sent: Thursday, January 7, 2021 7:09 AM
> To: core-libs-dev@openjdk.java.net
> Subject: Re: Monitoring wrapped ThreadPoolExecutor returned from Executors
>
> On 1/5/21 10:11 PM, Tommy Ludwig wrote:
>> In the Micrometer project, we provide metrics instrumentation of
>> `ExecutorService`s. For `ThreadPoolExecutor`s, we track the number of completed
>> tasks, active tasks, thread pool sizes, task queue size and remaining capacity
>> via methods from `ThreadPoolExecutor`. We are currently using a brittle
>> reflection hack[1] to do this for the wrapped `ThreadPoolExecutor` returned
>> from `Executors` methods `newSingleThreadExecutor` and
>> `newSingleThreadScheduledExecutor`. With the introduction of JEP-396 in JDK 16,
>> our reflection hack throws an InaccessibleObjectException by default.
>>
>> I am not seeing a proper way to get at the methods we use for the
>> metrics (e.g. `ThreadPoolExecutor::getCompletedTaskCount`) in this case. Is
>> there a way that I am missing?
> There's no guarantee that newSingleThreadExecutor returns a restricted
> view of a ThreadPoolExecutor, so there can't be a guaranteed way of
> accessing it.
>
> But I'm sympathetic to the idea that under the current implementation
> (which is unlikely to change anytime soon), the stats are available, and
> should be available to monitoring tools. But none of the ways to do this
> are very attractive: Creating a MonitorableExecutorService interface and
> returning that? Making the internal view class public with a protected
> getExecutor method?
>
>




Re: RFR: 8259048: (tz) Upgrade time-zone data to tzdata2020f [v2]

2021-01-15 Thread Kiran Sidhartha Ravikumar
On Mon, 4 Jan 2021 18:40:06 GMT, Naoto Sato  wrote:

> Looks good.
> IIUC, 2020f is already out, and the 2020e-2020f diff does not seem to include 
> any data change. Would you change this PR to incorporate 2020f?

Hi @naotoj ,

I have updated the VERSION and PR title to incorporate tzdata2020f, please let 
me know if the changes are good to push.

-

PR: https://git.openjdk.java.net/jdk/pull/1937


Re: RFR: 8259048: (tz) Upgrade time-zone data to tzdata2020f [v2]

2021-01-15 Thread Kiran Sidhartha Ravikumar
> Hi Guys,
> 
> Please review the integration of tzdata2020e to JDK.
> 
> Details regarding the change can be viewed at - 
> https://mm.icann.org/pipermail/tz-announce/2020-December/63.html
> Bug: https://bugs.openjdk.java.net/browse/JDK-8258878
> 
> Most of the changes are about correcting past timestamps and Australia/Currie 
> timezone is removed.
> 
> Regression Tests pass along with JCK.
> 
> Please let me know if the changes are good to push.
> 
> Thanks,
> Kiran

Kiran Sidhartha Ravikumar has updated the pull request with a new target base 
due to a merge or a rebase. The incremental webrev excludes the unrelated 
changes brought in by the merge/rebase. The pull request contains three 
additional commits since the last revision:

 - 8258878: (tz) Upgrade time-zone data to tzdata2020e
 - Merge remote-tracking branch 'origin/master' into JDK-8258878
 - 8258878: (tz) Upgrade time-zone data to tzdata2020e

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1937/files
  - new: https://git.openjdk.java.net/jdk/pull/1937/files/a89ac891..4e18e930

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1937&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1937&range=00-01

  Stats: 46333 lines in 1566 files changed: 19748 ins; 13335 del; 13250 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1937.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1937/head:pull/1937

PR: https://git.openjdk.java.net/jdk/pull/1937