Re: LinearProbeHashtable Re: ClassValue perf?

Peter Levart Fri, 27 May 2016 03:42:07 -0700

Hi Paul,


On 05/26/2016 01:20 PM, Paul Sandoz wrote:

Hi Peter,

Opportunistically if your LinearProbeHashtable works out then i am wondering if 
we could replace the use of CHM within MethodType.ConcurrentWeakInternSet, 
which only uses get/putIfAbsent/remove.

Thereby CHM can use VarHandles without inducing a circular dependency.

Paul.

It could be used, yes. LinearProbeHashtable is not scalable to multiplethreads like CHM is for modifications as it is using a single lock forall modification operations including rehashing, but it is lock-free forlookups, so for usecases such as caching, where lookups dominate andmodifications are mostly performed in batches from single thread (whensome subsystem initializes), it could be a viable alternative to CHM.

If it is moved to some jdk.internal subpackage and made public, I couldadd missing Map methods, mostly to be able to include it in MOAT tests.


What do you think?

Regards, Peter

On 26 May 2016, at 10:59, Peter Levart <[email protected]> wrote:

Hi Michael,

On 05/23/2016 03:56 PM, Michael Haupt wrote:

I've ran the unpatched version and Peter's two patches once more. The results 
are attached (results.txt). They confirm Aleksey's observation.

Regarding the 03 patch (plevart3 column in the results), perfasm output (see 
http://cr.openjdk.java.net/~mhaupt/8031043/perfasm.zip) suggests the cost is 
mainly accrued in ConcurrentHashMap. The same is the case for the 02 patch 
(plevart2 column).

As things stand, I think we can even focus on Peter's 02 patch, as this is the 
faster of his two proposals (plevart2 column in the results), reduces the 
footprint, and reduces the implementation complexity. Can anything be done to 
improve on its performance? (It has slight performance slowdowns for the 
single-value case as well.)

I can't think of anything else short of improving performance of CHM itself.

Or replacing CHM with a "better" implementation:

     http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04/

This webrev is similar to webrev.02. It's only difference is in ClassValueMap 
which extends LinearProbeHashtable instead of ConcurrentHashMap. 
LinearProbeHashtable is a simple implementation of a linear-probe hash table. 
It's not a full Map implementation. It only implements methods needed in 
ClassValue. With this implementation I get a slight boost compared to JDK 9 
ClassValue implementation for all sizes and counts:

Benchmark                         (classCount)  (classValueCount)  (impl)  Mode 
 Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1    jdk9  avgt 
  10   9.079 ± 0.092  ns/op
ClassValueBench.randomAccess               128                  4    jdk9  avgt 
  10  10.615 ± 0.102  ns/op
ClassValueBench.randomAccess               128                 16    jdk9  avgt 
  10  11.665 ± 0.012  ns/op
ClassValueBench.randomAccess               128                256    jdk9  avgt 
  10  19.151 ± 0.219  ns/op
ClassValueBench.randomAccess              1024                  1    jdk9  avgt 
  10  14.642 ± 0.425  ns/op
ClassValueBench.randomAccess              1024                  4    jdk9  avgt 
  10  22.577 ± 0.093  ns/op
ClassValueBench.randomAccess              1024                 16    jdk9  avgt 
  10  19.864 ± 0.736  ns/op
ClassValueBench.randomAccess              1024                256    jdk9  avgt 
  10  60.470 ± 0.285  ns/op
ClassValueBench.sequentialAccess           128                  1    jdk9  avgt 
  10   9.741 ± 0.033  ns/op
ClassValueBench.sequentialAccess           128                  4    jdk9  avgt 
  10   8.252 ± 0.029  ns/op
ClassValueBench.sequentialAccess           128                 16    jdk9  avgt 
  10   7.888 ± 1.249  ns/op
ClassValueBench.sequentialAccess           128                256    jdk9  avgt 
  10  16.493 ± 0.415  ns/op
ClassValueBench.sequentialAccess          1024                  1    jdk9  avgt 
  10  13.376 ± 0.452  ns/op
ClassValueBench.sequentialAccess          1024                  4    jdk9  avgt 
  10  10.023 ± 0.020  ns/op
ClassValueBench.sequentialAccess          1024                 16    jdk9  avgt 
  10   8.029 ± 0.178  ns/op
ClassValueBench.sequentialAccess          1024                256    jdk9  avgt 
  10  33.472 ± 0.058  ns/op

Benchmark                         (classCount)  (classValueCount)  (impl)  Mode 
 Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1    pl04  avgt 
  10   8.955 ± 0.055  ns/op
ClassValueBench.randomAccess               128                  4    pl04  avgt 
  10   9.999 ± 0.017  ns/op
ClassValueBench.randomAccess               128                 16    pl04  avgt 
  10  11.615 ± 1.928  ns/op
ClassValueBench.randomAccess               128                256    pl04  avgt 
  10  17.063 ± 0.460  ns/op
ClassValueBench.randomAccess              1024                  1    pl04  avgt 
  10  12.553 ± 0.086  ns/op
ClassValueBench.randomAccess              1024                  4    pl04  avgt 
  10  16.766 ± 0.221  ns/op
ClassValueBench.randomAccess              1024                 16    pl04  avgt 
  10  18.496 ± 0.051  ns/op
ClassValueBench.randomAccess              1024                256    pl04  avgt 
  10  41.390 ± 0.321  ns/op
ClassValueBench.sequentialAccess           128                  1    pl04  avgt 
  10   7.854 ± 0.381  ns/op
ClassValueBench.sequentialAccess           128                  4    pl04  avgt 
  10   7.498 ± 0.055  ns/op
ClassValueBench.sequentialAccess           128                 16    pl04  avgt 
  10   9.218 ± 1.000  ns/op
ClassValueBench.sequentialAccess           128                256    pl04  avgt 
  10  13.593 ± 0.275  ns/op
ClassValueBench.sequentialAccess          1024                  1    pl04  avgt 
  10   8.774 ± 0.037  ns/op
ClassValueBench.sequentialAccess          1024                  4    pl04  avgt 
  10   8.562 ± 0.014  ns/op
ClassValueBench.sequentialAccess          1024                 16    pl04  avgt 
  10   7.596 ± 0.027  ns/op
ClassValueBench.sequentialAccess          1024                256    pl04  avgt 
  10  24.605 ± 0.143  ns/op


The footprint is even better than with CHM as LPHT does not maintain internal 
Map.Entry(s).

How does this implementation compare on your hardware, Michael?

Regards, Peter

_______________________________________________
mlvm-dev mailing list
[email protected]
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev



_______________________________________________
mlvm-dev mailing list
[email protected]
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

_______________________________________________
mlvm-dev mailing list
[email protected]
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

Re: LinearProbeHashtable Re: ClassValue perf?

Reply via email to