[
https://issues.apache.org/jira/browse/SAMZA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160509#comment-14160509
]
Jay Kreps commented on SAMZA-428:
---------------------------------
Hey [~theduderog], yes, that is actually how it works today. The write cache
collects pending writes and applies them as a single batch update. This
allows us to avoid duplicate writes and serialization overhead.
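
As a rough illustration of the idea (not Samza's actual CachedStore code), a
write cache can keep only the latest pending value per key and hand the
underlying store one batch at flush time. The names WriteCache, CountingStore,
put_all and max_dirty below are hypothetical and exist only for this sketch:

```python
class CountingStore:
    """Hypothetical stand-in for the underlying store; counts what it receives."""

    def __init__(self):
        self.data = {}
        self.batches = 0  # number of batch updates applied
        self.writes = 0   # number of individual key writes applied

    def get(self, key):
        return self.data.get(key)

    def put_all(self, entries):
        self.batches += 1
        self.writes += len(entries)
        self.data.update(entries)


class WriteCache:
    """Toy write cache: buffers puts and flushes them as one batch update."""

    def __init__(self, store, max_dirty=1000):
        self.store = store
        self.max_dirty = max_dirty  # assumed flush threshold, not a real config
        self.dirty = {}             # key -> latest pending value

    def put(self, key, value):
        # A later write to the same key replaces the earlier pending one,
        # so the store never sees the intermediate values.
        self.dirty[key] = value
        if len(self.dirty) >= self.max_dirty:
            self.flush()

    def get(self, key):
        # Pending writes must be visible to reads before they are flushed.
        if key in self.dirty:
            return self.dirty[key]
        return self.store.get(key)

    def flush(self):
        if self.dirty:
            self.store.put_all(self.dirty)
            self.dirty = {}


store = CountingStore()
cache = WriteCache(store)
for i in range(100):
    cache.put("counter", i)  # 100 logical writes to a single key
cache.flush()                # the store receives one batch with one entry
```

In this sketch the store only has to handle (and, in a real store, serialize)
the final value for each key rather than every intermediate update.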
I think what you are asking is whether this batching should be pushed to the
user level. I actually think letting users do their own batching is
preferable if there is no other benefit to having it in the framework.
Getting a message at a time seems simpler to understand and more elegant to
me, and it gives the most flexibility in how you would batch your external
calls.
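
A minimal sketch of what user-level batching could look like when messages
arrive one at a time. This is not Samza's task API; BatchingTask, send_batch
and batch_size are hypothetical names, and send_batch stands in for whatever
external call you want to batch:

```python
class BatchingTask:
    """Receives one message at a time and batches its own external calls."""

    def __init__(self, send_batch, batch_size=3):
        self.send_batch = send_batch  # callable taking a list of messages
        self.batch_size = batch_size
        self.pending = []

    def process(self, message):
        # The framework hands over a single message; the task decides
        # when enough have accumulated to justify an external call.
        self.pending.append(message)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered, e.g. on a timer or before a checkpoint.
        if self.pending:
            self.send_batch(list(self.pending))
            self.pending.clear()


sent = []
task = BatchingTask(sent.append, batch_size=3)
for message in range(7):
    task.process(message)
task.flush()  # push out the final partial batch
```

Because the buffering lives in user code, the batch size and flush policy can
be chosen per external system instead of being fixed by the framework. A real
job would also need to flush before its progress is checkpointed so that
buffered messages are not lost.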
> Investigate: how to tune down caching in the KeyValueStore implementations
> --------------------------------------------------------------------------
>
> Key: SAMZA-428
> URL: https://issues.apache.org/jira/browse/SAMZA-428
> Project: Samza
> Issue Type: Improvement
> Components: kv
> Affects Versions: 0.8.0
> Reporter: Chinmay Soman
> Fix For: 0.8.0
>
>
> Currently, we have a 'CachedStore' layer on top of the KeyValueStore
> implementation that we use. This might lead to double caching:
> i) Once at the CachedStore layer
> ii) Possibly cached again in the specific K-V store that we use (e.g.
> RocksDB / BDB)
> We need the CachedStore layer so that the writes to LoggedStore (if
> configured) are done in an efficient manner.
> We can then potentially do some config tuning for the K-V store to reduce its
> memory footprint and simply write to disk.