Hi Paul,

On 05/26/2016 01:20 PM, Paul Sandoz wrote:
Hi Peter,

Opportunistically if your LinearProbeHashtable works out then i am wondering if 
we could replace the use of CHM within MethodType.ConcurrentWeakInternSet, 
which only uses get/putIfAbsent/remove.

Thereby CHM can use VarHandles without inducing a circular dependency.

Paul.

It could be used, yes. LinearProbeHashtable is not scalable to multiple threads like CHM is for modifications as it is using a single lock for all modification operations including rehashing, but it is lock-free for lookups, so for usecases such as caching, where lookups dominate and modifications are mostly performed in batches from single thread (when some subsystem initializes), it could be a viable alternative to CHM.

If it is moved to some jdk.internal subpackage and made public, I could add missing Map methods, mostly to be able to include it in MOAT tests.

What do you think?

Regards, Peter


On 26 May 2016, at 10:59, Peter Levart <peter.lev...@gmail.com> wrote:

Hi Michael,

On 05/23/2016 03:56 PM, Michael Haupt wrote:
I've ran the unpatched version and Peter's two patches once more. The results 
are attached (results.txt). They confirm Aleksey's observation.

Regarding the 03 patch (plevart3 column in the results), perfasm output (see 
http://cr.openjdk.java.net/~mhaupt/8031043/perfasm.zip) suggests the cost is 
mainly accrued in ConcurrentHashMap. The same is the case for the 02 patch 
(plevart2 column).

As things stand, I think we can even focus on Peter's 02 patch, as this is the 
faster of his two proposals (plevart2 column in the results), reduces the 
footprint, and reduces the implementation complexity. Can anything be done to 
improve on its performance? (It has slight performance slowdowns for the 
single-value case as well.)
I can't think of anything else short of improving performance of CHM itself.

Or replacing CHM with a "better" implementation:

     http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04/

This webrev is similar to webrev.02. It's only difference is in ClassValueMap 
which extends LinearProbeHashtable instead of ConcurrentHashMap. 
LinearProbeHashtable is a simple implementation of a linear-probe hash table. 
It's not a full Map implementation. It only implements methods needed in 
ClassValue. With this implementation I get a slight boost compared to JDK 9 
ClassValue implementation for all sizes and counts:

Benchmark                         (classCount)  (classValueCount)  (impl)  Mode 
 Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1    jdk9  avgt 
  10   9.079 ± 0.092  ns/op
ClassValueBench.randomAccess               128                  4    jdk9  avgt 
  10  10.615 ± 0.102  ns/op
ClassValueBench.randomAccess               128                 16    jdk9  avgt 
  10  11.665 ± 0.012  ns/op
ClassValueBench.randomAccess               128                256    jdk9  avgt 
  10  19.151 ± 0.219  ns/op
ClassValueBench.randomAccess              1024                  1    jdk9  avgt 
  10  14.642 ± 0.425  ns/op
ClassValueBench.randomAccess              1024                  4    jdk9  avgt 
  10  22.577 ± 0.093  ns/op
ClassValueBench.randomAccess              1024                 16    jdk9  avgt 
  10  19.864 ± 0.736  ns/op
ClassValueBench.randomAccess              1024                256    jdk9  avgt 
  10  60.470 ± 0.285  ns/op
ClassValueBench.sequentialAccess           128                  1    jdk9  avgt 
  10   9.741 ± 0.033  ns/op
ClassValueBench.sequentialAccess           128                  4    jdk9  avgt 
  10   8.252 ± 0.029  ns/op
ClassValueBench.sequentialAccess           128                 16    jdk9  avgt 
  10   7.888 ± 1.249  ns/op
ClassValueBench.sequentialAccess           128                256    jdk9  avgt 
  10  16.493 ± 0.415  ns/op
ClassValueBench.sequentialAccess          1024                  1    jdk9  avgt 
  10  13.376 ± 0.452  ns/op
ClassValueBench.sequentialAccess          1024                  4    jdk9  avgt 
  10  10.023 ± 0.020  ns/op
ClassValueBench.sequentialAccess          1024                 16    jdk9  avgt 
  10   8.029 ± 0.178  ns/op
ClassValueBench.sequentialAccess          1024                256    jdk9  avgt 
  10  33.472 ± 0.058  ns/op

Benchmark                         (classCount)  (classValueCount)  (impl)  Mode 
 Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1    pl04  avgt 
  10   8.955 ± 0.055  ns/op
ClassValueBench.randomAccess               128                  4    pl04  avgt 
  10   9.999 ± 0.017  ns/op
ClassValueBench.randomAccess               128                 16    pl04  avgt 
  10  11.615 ± 1.928  ns/op
ClassValueBench.randomAccess               128                256    pl04  avgt 
  10  17.063 ± 0.460  ns/op
ClassValueBench.randomAccess              1024                  1    pl04  avgt 
  10  12.553 ± 0.086  ns/op
ClassValueBench.randomAccess              1024                  4    pl04  avgt 
  10  16.766 ± 0.221  ns/op
ClassValueBench.randomAccess              1024                 16    pl04  avgt 
  10  18.496 ± 0.051  ns/op
ClassValueBench.randomAccess              1024                256    pl04  avgt 
  10  41.390 ± 0.321  ns/op
ClassValueBench.sequentialAccess           128                  1    pl04  avgt 
  10   7.854 ± 0.381  ns/op
ClassValueBench.sequentialAccess           128                  4    pl04  avgt 
  10   7.498 ± 0.055  ns/op
ClassValueBench.sequentialAccess           128                 16    pl04  avgt 
  10   9.218 ± 1.000  ns/op
ClassValueBench.sequentialAccess           128                256    pl04  avgt 
  10  13.593 ± 0.275  ns/op
ClassValueBench.sequentialAccess          1024                  1    pl04  avgt 
  10   8.774 ± 0.037  ns/op
ClassValueBench.sequentialAccess          1024                  4    pl04  avgt 
  10   8.562 ± 0.014  ns/op
ClassValueBench.sequentialAccess          1024                 16    pl04  avgt 
  10   7.596 ± 0.027  ns/op
ClassValueBench.sequentialAccess          1024                256    pl04  avgt 
  10  24.605 ± 0.143  ns/op


The footprint is even better than with CHM as LPHT does not maintain internal 
Map.Entry(s).

How does this implementation compare on your hardware, Michael?

Regards, Peter

_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

Reply via email to