GitHub user berg223 edited a discussion: Should we continue to use 
ConcurrentLongHashMap in source code?

I have written a 
[benchmark](https://github.com/berg223/BenchMark/blob/main/src/main/java/apache/pulsar/map/BenchMarkRunner.java)
 for ConcurrentLongHashMap and ConcurrentHashMap. The results show that 
ConcurrentHashMap has at least 1.6x the throughput of ConcurrentLongHashMap, 
ranging from roughly 1.6x for get to more than 6x for remove. 
However, ConcurrentHashMap does have a higher GC count. **Do you think it's a 
tradeoff between throughput and GC count?**

I have uploaded the benchmark results as an attachment: 
[benchmark-concurrent-hashmap.json](https://github.com/user-attachments/files/25598801/benchmark-concurrent-hashmap.json).
 You can visualize the results in [JMH Visualizer](https://jmh.morethan.io/). 
The throughput results are summarized in the following table:

<table>
<tr><td>method</td> <td>ConcurrentHashMap</td> <td>ConcurrentLongHashMap</td></tr>
<tr><td>getAbsent</td> <td>about 500,000,000 ops/s</td> <td>about 300,000,000 ops/s</td></tr>
<tr><td>getPresent</td> <td>about 500,000,000 ops/s</td> <td>about 300,000,000 ops/s</td></tr>
<tr><td>putAbsent</td> <td>about 14,000,000 ops/s</td> <td>about 5,000,000 ops/s</td></tr>
<tr><td>putPresent</td> <td>about 80,000,000 ops/s</td> <td>about 30,000,000 ops/s</td></tr>
<tr><td>removeAbsent</td> <td>about 220,000,000 ops/s</td> <td>about 33,000,000 ops/s</td></tr>
<tr><td>removePresent</td> <td>about 200,000,000 ops/s</td> <td>about 30,000,000 ops/s</td></tr>
</table>

**My question: Should we optimize ConcurrentLongHashMap or stop using it?**

Pros:
1. Avoid boxing and unboxing
2. Cache friendly due to linear probing and compact memory
3. Auto shrink is a great feature
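
To make the first two points concrete, here is a minimal sketch of a long-keyed open-addressing table with linear probing. It is an illustration only, not the actual ConcurrentLongHashMap code: the class name `LongKeyMapSketch`, the hash constant, and the 2/3 load factor are my own assumptions, and it is single-threaded, grow-only, and has no remove. Keys live in a plain `long[]`, so a lookup never allocates a `Long`, and a collision is resolved by scanning neighbouring slots of the same array.

```java
import java.util.Objects;

/** Hypothetical single-threaded sketch; not the Pulsar/BookKeeper implementation. */
final class LongKeyMapSketch<V> {
    private long[] keys;      // primitive keys: no Long boxing on get/put
    private Object[] values;  // a null value marks an empty slot
    private int size;

    LongKeyMapSketch(int expectedCapacity) {
        // Round up to a power of two so (index & mask) can replace modulo.
        int cap = Integer.highestOneBit(Math.max(4, expectedCapacity) - 1) << 1;
        keys = new long[cap];
        values = new Object[cap];
    }

    int size() {
        return size;
    }

    // Multiplicative hash (assumed constant); the upper bits pick the start slot.
    private static int slot(long key, int mask) {
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h >>> 32) & mask;
    }

    @SuppressWarnings("unchecked")
    V get(long key) {
        int mask = keys.length - 1;
        int i = slot(key, mask);
        // Linear probing: walk adjacent slots until the key or an empty slot is found.
        while (values[i] != null) {
            if (keys[i] == key) {
                return (V) values[i];
            }
            i = (i + 1) & mask;
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    V put(long key, V value) {
        Objects.requireNonNull(value, "null marks an empty slot");
        // Keep the load factor at or below 2/3 so a probe always finds an empty slot.
        if ((size + 1) * 3L > keys.length * 2L) {
            grow();
        }
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (values[i] != null) {
            if (keys[i] == key) {
                V old = (V) values[i];
                values[i] = value;
                return old;
            }
            i = (i + 1) & mask;
        }
        keys[i] = key;
        values[i] = value;
        size++;
        return null;
    }

    @SuppressWarnings("unchecked")
    private void grow() {
        long[] oldKeys = keys;
        Object[] oldValues = values;
        keys = new long[oldKeys.length * 2];
        values = new Object[oldKeys.length * 2];
        size = 0;
        // Re-insert every live entry into the doubled arrays.
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldValues[j] != null) {
                put(oldKeys[j], (V) oldValues[j]);
            }
        }
    }
}
```

By contrast, `ConcurrentHashMap<Long, V>` takes an `Object` key, so each call with a primitive `long` goes through `Long.valueOf`, which may allocate for values outside the small cached range. That is consistent with the higher GC count seen in the benchmark.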

Cons:
1. Low throughput
2. Static concurrency level
3. We have run into multiple concurrency issues with ConcurrentLongHashMap:
https://github.com/apache/bookkeeper/issues/4684
https://github.com/apache/pulsar/pull/18390
https://github.com/apache/bookkeeper/pull/4317
https://github.com/apache/pulsar/pull/22604
4. We have to keep ConcurrentLongHashMap in sync between the bookkeeper and pulsar repos. 
Other classes, such as ConcurrentLongLongPairHashMap, have the same cons.

GitHub link: https://github.com/apache/pulsar/discussions/25271
