[ 
https://issues.apache.org/jira/browse/SAMZA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152693#comment-15152693
 ] 

Nicolas Maquet commented on SAMZA-873:
--------------------------------------

[~nickpan47] I already have, but I didn't link it here. Here it is: 
https://reviews.apache.org/r/43589

> Avoid unnecessary flushes in CachedStore
> ----------------------------------------
>
>                 Key: SAMZA-873
>                 URL: https://issues.apache.org/jira/browse/SAMZA-873
>             Project: Samza
>          Issue Type: Improvement
>          Components: kv
>    Affects Versions: 0.10.0
>            Reporter: Nicolas Maquet
>            Assignee: Nicolas Maquet
>             Fix For: 0.10.1
>
>         Attachments: 
> 0001-SAMZA-873-Fix-CachedStore-to-not-call-flush-unnecess.patch
>
>
> The class {{org.apache.samza.storage.kv.CachedStore}} currently calls 
> {{store.flush()}} when evicting dirty entries. This in turn causes RocksDB to 
> flush its memtables far more often than necessary, which slows the store down. 
> In a mixed put / get workload, e.g. 2 gets for 1 put with an object cache 
> size of 1000, RocksDB will flush its memtable roughly every 333 calls to 
> put(), that is, every time the eldest entry in the cache is dirty. In our 
> benchmarks, this leads to a more than 20x drop in throughput.
> The attached patch fixes the issue as follows:
> - {{CachedStore.put()}} no longer flushes when evicting dirty entries. 
> It calls {{store.putAll()}} with all dirty entries and resets the dirty list 
> and count but does not call {{store.flush()}}.
> - Likewise, {{CachedStore.cache.removeEldestEntry()}} no longer flushes when 
> evicting dirty entries.
> It calls {{store.putAll()}} on all dirty entries and resets the dirty list 
> and count.
> - The behavior of {{CachedStore.flush()}} is unaffected.
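
The behavior change described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions, not the Samza code or the attached patch: CountingStore, WriteBackCache, simulate and the flushOnEvict switch are hypothetical names invented for this example, and the workload (2 cache-missing gets per put, all on distinct keys) is an assumed reading of the benchmark above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class CachedStoreSketch {

    /** Hypothetical stand-in for the underlying store; it only counts calls. */
    static class CountingStore {
        final Map<String, String> data = new HashMap<>();
        int putAllCalls = 0;
        int flushCalls = 0;

        String get(String key) { return data.get(key); }
        void putAll(Map<String, String> entries) { data.putAll(entries); putAllCalls++; }
        void flush() { flushCalls++; }
    }

    /** Simplified write-back LRU cache (illustrative only, not the Samza class). */
    static class WriteBackCache {
        private final CountingStore store;
        private final Map<String, String> dirty = new LinkedHashMap<>();
        private final LinkedHashMap<String, String> cache;

        WriteBackCache(CountingStore store, int capacity, boolean flushOnEvict) {
            this.store = store;
            // Access-ordered LinkedHashMap: the eldest entry is the least recently used.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    if (size() <= capacity) {
                        return false;
                    }
                    if (dirty.containsKey(eldest.getKey())) {
                        writeDirtyEntries();      // both variants write all dirty entries
                        if (flushOnEvict) {
                            store.flush();        // only the old behavior also flushes here
                        }
                    }
                    return true;
                }
            };
        }

        void put(String key, String value) {
            dirty.put(key, value);
            cache.put(key, value);
        }

        String get(String key) {
            if (cache.containsKey(key)) {
                return cache.get(key);
            }
            String value = store.get(key);
            cache.put(key, value);                // cache misses too, as an LRU read cache would
            return value;
        }

        /** An explicit flush still writes dirty entries and flushes the store. */
        void flush() {
            writeDirtyEntries();
            store.flush();
        }

        private void writeDirtyEntries() {
            if (!dirty.isEmpty()) {
                store.putAll(dirty);
                dirty.clear();                    // reset the dirty list
            }
        }
    }

    /** Runs the assumed workload: 2 gets and 1 put per iteration, all on distinct keys. */
    static CountingStore simulate(boolean flushOnEvict, int capacity, int puts) {
        CountingStore store = new CountingStore();
        WriteBackCache cache = new WriteBackCache(store, capacity, flushOnEvict);
        for (int i = 0; i < puts; i++) {
            cache.get("a" + i);
            cache.get("b" + i);
            cache.put("p" + i, "v" + i);
        }
        return store;
    }

    public static void main(String[] args) {
        CountingStore before = simulate(true, 1000, 10000);
        CountingStore after = simulate(false, 1000, 10000);
        System.out.println("flush on evict:    flushes=" + before.flushCalls
                + ", putAll calls=" + before.putAllCalls);
        System.out.println("no flush on evict: flushes=" + after.flushCalls
                + ", putAll calls=" + after.putAllCalls);
    }
}
```

In this model both variants issue the same putAll() calls when a dirty entry is evicted. The only difference is whether each of those evictions also triggers a flush of the underlying store.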



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
