[ 
https://issues.apache.org/jira/browse/SAMZA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160509#comment-14160509
 ] 

Jay Kreps commented on SAMZA-428:
---------------------------------

Hey [~theduderog], yes, that is actually how it works today. The write cache 
collects all writes into a single batch update. This allows us to avoid 
duplicate writes and serialization overhead.
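
To make the idea concrete, here is a rough sketch of a write-batching cache. 
It is not the actual CachedStore code: the class names, `put_all`, and 
`batch_size` are all made up for illustration, and a plain dict stands in for 
the underlying store. It only shows why repeated writes to one key collapse 
into a single entry in the batch.

```python
class DictStore:
    """Hypothetical stand-in for the underlying key-value store."""

    def __init__(self):
        self.data = {}
        self.batch_calls = 0      # number of batch updates received
        self.entries_written = 0  # total entries across all batches

    def put_all(self, entries):
        # In a real store this is where serialization and disk I/O would occur.
        self.batch_calls += 1
        self.entries_written += len(entries)
        self.data.update(entries)

    def get(self, key):
        return self.data.get(key)


class WriteBatchingCache:
    """Hypothetical write cache: buffers dirty entries, flushes them as one batch."""

    def __init__(self, store, batch_size):
        self.store = store
        self.batch_size = batch_size
        self.dirty = {}  # key -> latest unflushed value

    def put(self, key, value):
        # A later write to the same key replaces the earlier one,
        # so only the latest value is ever sent to the store.
        self.dirty[key] = value
        if len(self.dirty) >= self.batch_size:
            self.flush()

    def get(self, key):
        # Unflushed writes must be visible to reads.
        if key in self.dirty:
            return self.dirty[key]
        return self.store.get(key)

    def flush(self):
        if self.dirty:
            self.store.put_all(dict(self.dirty))
            self.dirty.clear()
```

With this sketch, five puts to the same key followed by one flush reach the 
store as a single batch containing a single entry.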

I think what you are asking is whether this batching should be pushed to the 
user level. I actually think letting users do their own batching is 
preferable if there is no other benefit to having it in the framework. Getting 
a message at a time seems simpler to understand and more elegant to me, and it 
gives the most flexibility in how you batch your external calls.
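
As a rough illustration of user-level batching, assume a task that receives 
one message per call and buffers its own external calls. This is not Samza's 
task API: `BatchingTask`, `send_batch`, and `max_batch` are hypothetical names 
chosen for the sketch.

```python
class BatchingTask:
    """Hypothetical task: handles one message at a time, batches external calls itself."""

    def __init__(self, send_batch, max_batch):
        self.send_batch = send_batch  # callable taking a list of messages
        self.max_batch = max_batch    # flush once this many messages are pending
        self.pending = []

    def process(self, message):
        # The framework hands over a single message; the user decides when to batch.
        self.pending.append(message)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        # Send whatever is buffered as one external call, then reset the buffer.
        if self.pending:
            self.send_batch(list(self.pending))
            self.pending.clear()
```

The batching policy (by count here, but it could be by time or by size) stays 
entirely in user code, which is the flexibility I mean.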

> Investigate: how to tune down caching in the KeyValueStore implementations
> --------------------------------------------------------------------------
>
>                 Key: SAMZA-428
>                 URL: https://issues.apache.org/jira/browse/SAMZA-428
>             Project: Samza
>          Issue Type: Improvement
>          Components: kv
>    Affects Versions: 0.8.0
>            Reporter: Chinmay Soman
>             Fix For: 0.8.0
>
>
> Currently, we have a 'CachedStore' layer on top of the KeyValueStore 
> implementation that we use. This might lead to double caching:
> i) Once at the CachedStore layer
> ii) Possibly cached again in the specific K-V store that we use (e.g. 
> RocksDB / BDB)
> We need the CachedStore layer so that the writes to LoggedStore (if 
> configured) are done in an efficient manner. 
> We can then potentially do some config tuning for the K-V store to reduce its 
> memory footprint and simply write to disk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
