GitHub user berg223 edited a discussion: Should we continue to use 
ConcurrentLongHashMap in source code?

I have written a
[benchmark](https://github.com/berg223/BenchMark/blob/main/src/main/java/apache/pulsar/map/BenchMarkRunner.java)
for ConcurrentLongHashMap and ConcurrentHashMap. The results show that
ConcurrentHashMap has at least 1.6x the throughput of ConcurrentLongHashMap.
However, ConcurrentHashMap does have a higher GC count. **Do you think this is a
tradeoff between throughput and GC count?**

I have uploaded the benchmark results as an attachment:
[benchmark-concurrent-hashmap.json](https://github.com/user-attachments/files/25598801/benchmark-concurrent-hashmap.json).
You can visualize the results in [JMH Visualizer](https://jmh.morethan.io/).
The throughput results are summarized in the following table:

<table>
<tr><th>method</th> <th>ConcurrentHashMap</th> <th>ConcurrentLongHashMap</th></tr>
<tr><td>getAbsent</td> <td>about 500,000,000 ops/s</td> <td>about 300,000,000 ops/s</td></tr>
<tr><td>getPresent</td> <td>about 500,000,000 ops/s</td> <td>about 300,000,000 ops/s</td></tr>
<tr><td>putAbsent</td> <td>about 14,000,000 ops/s</td> <td>about 5,000,000 ops/s</td></tr>
<tr><td>putPresent</td> <td>about 80,000,000 ops/s</td> <td>about 30,000,000 ops/s</td></tr>
<tr><td>removeAbsent</td> <td>about 220,000,000 ops/s</td> <td>about 33,000,000 ops/s</td></tr>
<tr><td>removePresent</td> <td>about 200,000,000 ops/s</td> <td>about 30,000,000 ops/s</td></tr>
</table>

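To make the gap per operation easier to see, here is a small sketch that divides the ConcurrentHashMap figure by the ConcurrentLongHashMap figure for each row of the table. The class name `ThroughputRatios` is made up for this post, and the inputs are the rounded "about" values from the table, not the raw JMH scores. With those inputs the ratio comes out at about 1.7x for the get cases, about 2.7x to 2.8x for the put cases, and about 6.7x for the remove cases.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThroughputRatios {
    // Approximate ops/s copied from the table: {ConcurrentHashMap, ConcurrentLongHashMap}.
    static final Map<String, double[]> RESULTS = new LinkedHashMap<>();

    static {
        RESULTS.put("getAbsent", new double[] {500_000_000, 300_000_000});
        RESULTS.put("getPresent", new double[] {500_000_000, 300_000_000});
        RESULTS.put("putAbsent", new double[] {14_000_000, 5_000_000});
        RESULTS.put("putPresent", new double[] {80_000_000, 30_000_000});
        RESULTS.put("removeAbsent", new double[] {220_000_000, 33_000_000});
        RESULTS.put("removePresent", new double[] {200_000_000, 30_000_000});
    }

    // Throughput of ConcurrentHashMap divided by that of ConcurrentLongHashMap.
    static double ratio(String method) {
        double[] opsPerSecond = RESULTS.get(method);
        return opsPerSecond[0] / opsPerSecond[1];
    }

    public static void main(String[] args) {
        for (String method : RESULTS.keySet()) {
            System.out.printf("%-14s %.2fx%n", method, ratio(method));
        }
    }
}
```
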
**My question: Should we optimize ConcurrentLongHashMap or stop using it?**

Pros:
1. Avoids boxing and unboxing of keys
2. Cache-friendly, due to linear probing and a compact memory layout
3. Automatic shrinking is a great feature

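For readers who have not looked at this kind of map, the first two points can be illustrated with a minimal sketch. This is not the actual ConcurrentLongHashMap code: the class `LongKeyProbeSketch` is hypothetical, it is single-threaded, has a fixed capacity, and leaves out removal and shrinking. It only shows that keys are stored as primitive `long` values in a flat array (no `Long` objects) and that a collision is resolved by stepping to the next slot.

```java
import java.util.Objects;

final class LongKeyProbeSketch<V> {
    private final long[] keys;      // primitive keys, no boxing
    private final Object[] values;  // a null value marks an empty slot
    private final int mask;

    LongKeyProbeSketch(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        keys = new long[capacity];
        values = new Object[capacity];
        mask = capacity - 1;
    }

    // Map a key to its first candidate slot.
    private int startSlot(long key) {
        int h = Long.hashCode(key);
        h ^= h >>> 16;
        return h & mask;
    }

    @SuppressWarnings("unchecked")
    V put(long key, V value) {
        Objects.requireNonNull(value, "value");
        int i = startSlot(key);
        for (int probes = 0; probes <= mask; probes++) {
            if (values[i] == null) {          // empty slot: insert here
                keys[i] = key;
                values[i] = value;
                return null;
            }
            if (keys[i] == key) {             // existing key: overwrite
                V old = (V) values[i];
                values[i] = value;
                return old;
            }
            i = (i + 1) & mask;               // linear probing: try the next slot
        }
        throw new IllegalStateException("table is full (this sketch does not resize)");
    }

    @SuppressWarnings("unchecked")
    V get(long key) {
        int i = startSlot(key);
        for (int probes = 0; probes <= mask; probes++) {
            if (values[i] == null) {
                return null;                  // hit an empty slot: key is absent
            }
            if (keys[i] == key) {
                return (V) values[i];
            }
            i = (i + 1) & mask;
        }
        return null;
    }
}
```
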
Cons:
1. Lower throughput
2. The concurrency level is static
3. We have run into multiple concurrency issues with ConcurrentLongHashMap:
https://github.com/apache/bookkeeper/issues/4684
https://github.com/apache/pulsar/pull/18390
https://github.com/apache/bookkeeper/pull/4317
https://github.com/apache/pulsar/pull/22604
4. We have to keep ConcurrentLongHashMap in sync between the bookkeeper and
pulsar repos. Other classes, such as ConcurrentLongLongPairHashMap, have the
same drawbacks.

IMO, we should replace ConcurrentLongHashMap with ConcurrentHashMap until we
have a better implementation.

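One possible shape for such a replacement, as a rough sketch only: a thin wrapper that keeps a `long`-keyed API for callers and delegates to `ConcurrentHashMap<Long, V>`. The class name `BoxedLongMap` and its method set are hypothetical and chosen for illustration; they are not taken from the pulsar or bookkeeper code. Every call boxes the key into a `Long`, which is where the extra GC pressure mentioned above would come from.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

final class BoxedLongMap<V> {
    private final ConcurrentHashMap<Long, V> delegate = new ConcurrentHashMap<>();

    // Each method autoboxes the primitive key before delegating.
    V get(long key) {
        return delegate.get(key);
    }

    V put(long key, V value) {
        return delegate.put(key, value);
    }

    V putIfAbsent(long key, V value) {
        return delegate.putIfAbsent(key, value);
    }

    V computeIfAbsent(long key, LongFunction<V> provider) {
        // The lambda unboxes the Long key again for the caller's function.
        return delegate.computeIfAbsent(key, k -> provider.apply(k));
    }

    V remove(long key) {
        return delegate.remove(key);
    }

    boolean containsKey(long key) {
        return delegate.containsKey(key);
    }

    int size() {
        return delegate.size();
    }
}
```
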
GitHub link: https://github.com/apache/pulsar/discussions/25271
