[ 
https://issues.apache.org/jira/browse/KAFKA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790766#comment-16790766
 ] 

Guozhang Wang commented on KAFKA-8094:
--------------------------------------

[~ableegoldman] I'd agree with you:

1) The iterator of the underlying built-in RocksDB store also operates on a 
snapshot of the store, so updates applied by the stream thread while the IQ 
(interactive query) caller thread is iterating will not be reflected either.
2) Hence, for the caching store: if entries are added after the iterator's 
snapshot, that is fine, since per 1) the underlying store does not reflect the 
latest image either; if entries are evicted after the iterator's snapshot, the 
merge iterator over the cache and the underlying store should still cover them 
(for the same key, it favors the entry from the cache iterator).
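
To illustrate the "favors the cache entry" rule in 2), here is a minimal sketch 
of a two-way merge over two key-sorted views. The class and method names 
(MergedRangeSketch, merge) are made up for this example, and plain TreeMaps 
stand in for the cache and the underlying store; this is not the actual Kafka 
Streams merge iterator, only the ordering and tie-breaking logic as I 
understand it.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and types are assumptions, not Kafka Streams code.
class MergedRangeSketch {

    // Merges two key-sorted maps into one key-ordered result.
    // When both sides hold the same key, the cache entry wins.
    static Map<String, String> merge(TreeMap<String, String> cache,
                                     TreeMap<String, String> store) {
        Iterator<Map.Entry<String, String>> c = cache.entrySet().iterator();
        Iterator<Map.Entry<String, String>> s = store.entrySet().iterator();
        Map.Entry<String, String> ce = c.hasNext() ? c.next() : null;
        Map.Entry<String, String> se = s.hasNext() ? s.next() : null;
        Map<String, String> out = new LinkedHashMap<>();

        while (ce != null || se != null) {
            // An exhausted side always loses the comparison.
            int cmp = (ce == null) ? 1
                    : (se == null) ? -1
                    : ce.getKey().compareTo(se.getKey());
            if (cmp <= 0) {
                out.put(ce.getKey(), ce.getValue());
                if (cmp == 0) {
                    // Same key on both sides: skip the older store entry.
                    se = s.hasNext() ? s.next() : null;
                }
                ce = c.hasNext() ? c.next() : null;
            } else {
                out.put(se.getKey(), se.getValue());
                se = s.hasNext() ? s.next() : null;
            }
        }
        return out;
    }
}
```

With a cache holding keys b and c and a store holding keys a and b, the merged 
view returns a from the store and b and c from the cache.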

> Iterating over cache with get(key) is inefficient 
> --------------------------------------------------
>
>                 Key: KAFKA-8094
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8094
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Sophie Blee-Goldman
>            Priority: Major
>              Labels: streams
>
> Currently, range queries in the caching layer are implemented by creating an 
> iterator over the subset of keys in the range, and calling get() on the 
> underlying TreeMap for each key. While this protects against 
> ConcurrentModificationException, we can improve performance by replacing the 
> TreeMap with a concurrent data structure such as ConcurrentSkipListMap and 
> then just iterating over a subMap.
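
The replacement proposed in the description could look roughly like the sketch 
below. It assumes String keys and values and a made-up helper name (range) 
purely for illustration; the real cache uses its own key and entry types. The 
point is that ConcurrentSkipListMap iterators are weakly consistent and never 
throw ConcurrentModificationException, so the range can be read directly from 
a subMap view without a get() per key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch; the key/value types and method name are assumptions.
class SkipListRangeSketch {

    // Returns "key=value" strings for all entries with from <= key <= to.
    static List<String> range(ConcurrentSkipListMap<String, String> cache,
                              String from, String to) {
        // subMap returns a live view backed by the map, not a copy.
        ConcurrentNavigableMap<String, String> view =
                cache.subMap(from, true, to, true);
        List<String> out = new ArrayList<>();
        // Safe to iterate while other threads write to the map:
        // the iterator is weakly consistent and does not fail fast.
        for (Map.Entry<String, String> e : view.entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }
}
```

Compared with iterating a key subset and calling get() on a TreeMap for each 
key, this walks the range once instead of doing a separate lookup per entry.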



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
