Thanks!
On Thu, Oct 23, 2014 at 10:56 AM, Akshat Aranya aara...@gmail.com wrote:
Yes, that is a downside of Spark's design in general. The only way to
share data across consumers is to have a separate entity own the
SparkContext. That's the idea behind Ooyala's job server. The driver is
still a single point of failure, though: if you lose the driver process,
you lose all information about the RDDs.
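For anyone following along: with the job server, jobs can share cached RDDs by name precisely because they all run inside the single long-lived SparkContext the server owns. A minimal sketch against the spark-jobserver API (SparkJob, NamedRddSupport); the RDD name, data path, and config key here are illustrative, not from any real deployment:

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

    // Runs inside the job server's long-lived SparkContext, so the
    // "users" RDD cached here is visible to every later job too.
    object LookupUserJob extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation =
        SparkJobValid

      override def runJob(sc: SparkContext, config: Config): Any = {
        // Build the shared RDD once under a well-known name, or reuse
        // it if an earlier job already created it.
        val users = namedRdds.getOrElseCreate("users", {
          sc.textFile("hdfs:///data/users.tsv")   // illustrative path
            .map { line => val f = line.split("\t"); (f(0), f(1)) }
            .cache()
        })
        // A point query: still executed as a scheduled Spark job.
        val key = config.getString("input.key")
        users.filter(_._1 == key).map(_._2).collect()
      }
    }

Note this inherits the single point of failure above: if the job server's context dies, the named RDDs go with it.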
On Oct 22, 2014 6:33 PM, Hajime Takase placeofnomemor...@gmail.com wrote:
Interesting. I see the interface of IndexedRDD, which seems to be like a
key/value store scoped to a single SparkContext:
https://github.com/apache/spark/pull/1297
But an IndexedRDD created in one SparkContext can't be used from another
(I want to use multiple drivers in my system)?
On Thu, Oct 23, 2014 at 1:01 AM, Akshat Aranya aara...@gmail.com wrote:
Spark, in general, is good for iterating through an entire dataset again
and again. All operations are expressed in terms of iteration through all
the records of at least one partition. You may want to look at IndexedRDD
(https://issues.apache.org/jira/browse/SPARK-2365), which aims to improve
point queries. In general, though, Spark is unlikely to outperform KV
stores because it schedules a job for every operation.
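For reference, the interface proposed there looks roughly like the following. A minimal sketch assuming the IndexedRDD API (Long keys, get returning an Option); the import path is from the later standalone spark-indexedrdd package, so the in-PR version may differ:

    import edu.berkeley.cs.amplab.spark.indexedrdd.IndexedRDD
    import edu.berkeley.cs.amplab.spark.indexedrdd.IndexedRDD._

    // Index a pair RDD once and cache it, so point queries hit the
    // resident index instead of rescanning every partition.
    val pairs = sc.parallelize((1L to 1000000L).map(x => (x, x * 2)))
    val indexed = IndexedRDD(pairs).cache()

    indexed.get(42L)                  // => Some(84)

    // Updates are functional: put returns a new IndexedRDD that shares
    // most of its structure with the old one, not an in-place mutation.
    val updated = indexed.put(1234567L, 0L)
    updated.get(1234567L)             // => Some(0)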
On Wed, Oct 22, 2014 at 7:51 AM, Hajime Takase placeofnomemor...@gmail.com wrote:
Hi,
Is it possible to use Spark as a clustered key/value store (say, like
redis-cluster or Hazelcast)? Would it outperform them on writes, reads, or
other operations?
My main motivation is to use the same RDD from several different
SparkContexts without saving to disk or using the spark-jobserver, but I'm
curious whether someone has already tried using Spark as a key/value store.
Thanks,
Hajime
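For context, the closest built-in approximation to KV-store usage within one SparkContext is a cached, hash-partitioned pair RDD queried with lookup(). A minimal spark-shell-style sketch (data and partition count illustrative), which also shows why reads and writes are costly compared to a real KV store:

    import org.apache.spark.HashPartitioner
    import org.apache.spark.SparkContext._   // pair-RDD implicits (Spark 1.x)

    // Cache a hash-partitioned pair RDD; lookup() then only scans the
    // single partition that can contain the key.
    val kv = sc.parallelize(Seq("a" -> 1, "b" -> 2, "c" -> 3))
      .partitionBy(new HashPartitioner(8))
      .cache()

    kv.lookup("b")    // Seq(2) -- but every call schedules a Spark job

    // There is no in-place write: "updating" means deriving and caching
    // a whole new RDD, which is why dedicated KV stores win on writes.
    val kv2 = kv.union(sc.parallelize(Seq("d" -> 4))).cache()
    kv2.lookup("d")   // Seq(4)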