On Fri, 12 Mar 2021 05:10:25 GMT, Sergey Bylokhov <s...@openjdk.org> wrote:

> FYI: probably is better/simpler to review it via webrev.
> 
> After migration to the lcms from the kcms the performance of some operations 
> was regressed. One possible workaround was to split the operation into 
> multiple threads. But unfortunately, we have too many bottlenecks which 
> prevent using multithreading. This is the request to remove/minimize such 
> bottlenecks(at least some of them), but it does not affect the 
> single-threaded performance it should be handled separately.
> 
> The main code pattern optimized here is this:
>     activate();
>     byte[] theHeader = getData(cmmProfile, icSigHead);
>     ---->  CMSManager.getModule().getTagData(p, tagSignature);
> Notes about the code above:
> 
> 1. Before the change "activate()" method checked that the "cmmProfile" field 
> was not null. After that we usually used the "cmmProfile" as the parameter to 
> some other method. This included two volatile reads, and also required to 
> check when we need to call the "activate()" method before usage of the 
> "cmmProfile" field.
> Solution: "activate()" renamed to the "cmmProfile()" which became an accessor 
> for the field, so we will get one volatile read and can easily monitor the 
> usage of the field itself(it is used directly only in this method).
> 
> 2. The synchronized static method "CMSManager.getModule()" reimplemented to 
> have only one volatile read.
> 
> 3. The usage of locking in the "getTagData()" is changed. Instead of the 
> synchronized instance methods, we now use the mix of "ConcurrentHashMap" and 
> StampedLock.
> 
> See some comments inline.
> 
> Some numbers(small numbers are better):
> 
> 1. Performance of ((ICC_ProfileRGB) 
> ICC_Profile.getInstance(ColorSpace.CS_sRGB)).getMatrix();
> 
> jdk 15.0.2
> Benchmark                              Mode  Cnt    Score      Error  Units
> CMMPerf.ThreadsMAX.testGetMatrix       avgt    5   19,624 ±    0,059  us/op
> CMMPerf.testGetMatrix                  avgt    5    0,154 ±    0,001  us/op
> 
> jdk - before the fix
> Benchmark                              Mode  Cnt    Score      Error  Units
> CMMPerf.ThreadsMAX.testGetMatrix       avgt    5   12,935 ±    0,042  us/op
> CMMPerf.testGetMatrix                  avgt    5    0,127 ±    0,007  us/op
> 
> jdk - after the fix
> Benchmark                              Mode  Cnt    Score      Error  Units
> CMMPerf.ThreadsMAX.testGetMatrix       avgt    5    0,561 ±    0,005  us/op
> CMMPerf.testGetMatrix                  avgt    5    0,092 ±    0,001  us/op
> 
> 2. Part of performance gain in jdk17 is from some other fixes, for example
>     Performance of ICC_Profile.getInstance(ColorSpace.CS_sRGB); and 
> ColorSpace.getInstance(ColorSpace.CS_sRGB);
> 
> jdk 15.0.2
> Benchmark                              Mode  Cnt    Score      Error  Units
> CMMPerf.ThreadsMAX.testGetSRGBProfile  avgt    5    2,299 ±    0,032  us/op
> CMMPerf.ThreadsMAX.testGetSRGBSpace    avgt    5    2,210 ±    0,051  us/op
> CMMPerf.testGetSRGBProfile             avgt    5    0,019 ±    0,001  us/op
> CMMPerf.testGetSRGBSpace               avgt    5    0,018 ±    0,001  us/op
> 
> jdk - same before/ after the fix
> Benchmark                              Mode  Cnt    Score      Error  Units
> CMMPerf.ThreadsMAX.testGetSRGBProfile  avgt    5    0,005 ±    0,001  us/op
> CMMPerf.ThreadsMAX.testGetSRGBSpace    avgt    5    0,005 ±    0,001  us/op
> CMMPerf.testGetSRGBProfile             avgt    5    0,005 ±    0,001  us/op
> CMMPerf.testGetSRGBSpace               avgt    5    0,005 ±    0,001  us/op
> 
> note "ThreadsMAX" is 32 threads.

Marked as reviewed by azvegint (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2957

Reply via email to