[ 
https://issues.apache.org/jira/browse/KAFKA-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608923#comment-17608923
 ] 

Avi Cherry commented on KAFKA-8802:
-----------------------------------

Sorry to have to re-open this, but unfortunately copying the keyset view when 
creating the InMemoryKeyValueIterator doesn't prevent the 
ConcurrentModificationException. It only moves the exception from the 
iteration itself to the copy performed while the InMemoryKeyValueIterator is 
being created. This may have made it slightly less likely to be triggered, but 
I just hit it in the wild. Until this is fixed I will need to avoid using an 
InMemoryKeyValueStore anywhere it's accessed via {{KafkaStreams.store()}}.
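
To illustrate the failure mode, here is a minimal sketch, not the actual Kafka 
Streams code. The class and method names are hypothetical, and the concurrent 
writer is simulated on a single thread by a put() that lands part-way through 
the key copy, so the outcome is deterministic rather than a race.

```java
import java.util.ConcurrentModificationException;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListMap;

public class KeySetCopyDemo {

    // Copies the key set element by element and simulates a writer that
    // slips in a put() after the first key has been copied.
    // Returns true if the copy fails with ConcurrentModificationException.
    public static boolean copyFailsUnderInterleavedWrite(NavigableMap<String, String> map) {
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        TreeSet<String> copy = new TreeSet<>();
        try {
            boolean written = false;
            for (String key : map.keySet()) {
                copy.add(key);
                if (!written) {
                    // Structural modification while the copy is in progress.
                    map.put("z", "26");
                    written = true;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("TreeMap copy failed: "
            + copyFailsUnderInterleavedWrite(new TreeMap<>()));
        System.out.println("ConcurrentSkipListMap copy failed: "
            + copyFailsUnderInterleavedWrite(new ConcurrentSkipListMap<>()));
    }
}
```

The TreeMap key iterator is fail-fast, so the copy fails as soon as the map is 
structurally modified underneath it. The ConcurrentSkipListMap iterator is 
weakly consistent and never throws ConcurrentModificationException.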

From the other sources I can find, ConcurrentSkipListMap is reported to be 
around half as fast as TreeMap. On the other hand, the existing version also 
copies all of the iterated data into a new TreeSet, which costs time and, 
especially, unnecessary space. Another way to look at this is that the 
InMemoryKeyValueStore would simply be as fast as the fastest available 
(open-source) map implementation in Java that's both sorted and concurrent. If 
someone really, really needed that extra performance out of it they could 
license [AirConcurrentMap|https://github.com/boilerbay/airconcurrentmap] and 
reimplement InMemoryKeyValueStore on top of it. Heck, we could make the 
ConcurrentNavigableMap implementation configurable in Kafka Streams if it was 
important enough. Another potential option is to offer a higher-performance 
in-memory store backed by TreeMap, as long as it's guaranteed to only be used 
from a single thread.
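
As a rough sketch of the configurable-map idea, assuming a hypothetical 
PluggableSortedStore class (this is not an existing Kafka Streams API), the 
backing map could be supplied by a factory. Range reads then iterate a view of 
the map directly instead of copying keys into a TreeSet first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Supplier;

// Hypothetical store whose sorted backing map is chosen by the caller.
public class PluggableSortedStore {
    private final NavigableMap<String, String> map;

    public PluggableSortedStore(Supplier<NavigableMap<String, String>> mapFactory) {
        this.map = mapFactory.get();
    }

    // Safe to read from other threads while the owner thread writes.
    public static PluggableSortedStore concurrent() {
        return new PluggableSortedStore(ConcurrentSkipListMap::new);
    }

    // Faster variant; only valid if a single thread ever touches the store.
    public static PluggableSortedStore singleThreaded() {
        return new PluggableSortedStore(TreeMap::new);
    }

    public void put(String key, String value) {
        map.put(key, value);
    }

    // Iterates a live sub-map view; no intermediate copy of the key set.
    public List<String> rangeKeys(String from, String to) {
        List<String> keys = new ArrayList<>();
        for (String key : map.subMap(from, true, to, true).keySet()) {
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        PluggableSortedStore store = PluggableSortedStore.concurrent();
        store.put("b", "2");
        store.put("a", "1");
        store.put("d", "4");
        System.out.println(store.rangeKeys("a", "c"));
    }
}
```

With the concurrent variant the range read is weakly consistent: it may or may 
not reflect writes that happen during the iteration, but it does not throw.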

The new issue is at KAFKA-14260.

> ConcurrentSkipListMap shows performance regression in cache and in-memory 
> store
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-8802
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8802
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.3.0
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: A. Sophie Blee-Goldman
>            Priority: Major
>             Fix For: 2.4.0, 2.3.1
>
>
> The use of ConcurrentSkipListMap in the cache and in-memory stores caused a 
> performance regression in 2.3.0. We should revert back to using TreeMap



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
