Spark, in general, is good for iterating through an entire dataset again
and again.  All of its operations are expressed as iteration over the
records of at least one partition.  You may want to look at IndexedRDD (
https://issues.apache.org/jira/browse/SPARK-2365), which aims to improve
point queries.  In general, though, Spark is unlikely to outperform KV
stores, because it has to schedule a job for every operation.
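To make that concrete, here is a toy sketch (plain Python, not the Spark API) of why a point query on a hash-partitioned dataset differs from a KV-store get: even when the partitioner narrows the lookup to a single partition, that partition is still scanned linearly, and in real Spark you also pay the cost of scheduling a job for it.

```python
# Illustrative sketch only: mimics how a hash-partitioned RDD would
# serve a point query, to contrast with an O(1) hash-map get.

NUM_PARTITIONS = 4

def partition_of(key):
    # Same idea as Spark's HashPartitioner: key -> partition index.
    return hash(key) % NUM_PARTITIONS

def make_partitions(pairs):
    # Distribute (key, value) pairs into partitions by key hash.
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for k, v in pairs:
        parts[partition_of(k)].append((k, v))
    return parts

def lookup(parts, key):
    # Like RDD.lookup on a partitioned RDD: only one partition is
    # examined, but that partition is scanned record by record.
    return [v for k, v in parts[partition_of(key)] if k == key]

parts = make_partitions([("a", 1), ("b", 2), ("a", 3)])
print(lookup(parts, "a"))  # [1, 3]
```

A KV store keeps an in-memory index so the get is a direct probe; the partition scan above, plus per-operation job scheduling, is the overhead the reply is describing.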

On Wed, Oct 22, 2014 at 7:51 AM, Hajime Takase <placeofnomemor...@gmail.com>
wrote:

> Hi,
> Is it possible to use Spark as a clustered key/value store (say, like
> redis-cluster or hazelcast)? Will it outperform them in write/read or
> other operations?
> My main urge is to use the same RDD from several different SparkContexts
> without saving to disk or using spark-jobserver, but I'm curious if someone
> has already tried using Spark like a key/value store.
>
> Thanks,
>
> Hajime
>
>
>