Thanks!

On Thu, Oct 23, 2014 at 10:56 AM, Akshat Aranya <aara...@gmail.com> wrote:

> Yes, that is a downside of Spark's design in general. The only way to
> share data across consumers is by having a separate entity that owns the
> Spark context. That's the idea behind Ooyala's job server. The driver is
> still a single point of failure; if you lose the driver process, you lose
> all information about the RDDs.
>
> On Oct 22, 2014 6:33 PM, "Hajime Takase" <placeofnomemor...@gmail.com> wrote:
>
>> Interesting. I see the interface of IndexedRDD, which looks like a
>> key/value store scoped to a particular SparkContext:
>> https://github.com/apache/spark/pull/1297
>> But a different SparkContext can't use another context's IndexedRDD?
>> (I want to use multiple "drivers" in my system.)
>>
>> On Thu, Oct 23, 2014 at 1:01 AM, Akshat Aranya <aara...@gmail.com> wrote:
>>
>>> Spark, in general, is good for iterating through an entire dataset
>>> again and again. All operations are expressed in terms of iteration
>>> through all the records of at least one partition. You may want to look
>>> at IndexedRDD (https://issues.apache.org/jira/browse/SPARK-2365), which
>>> aims to improve point queries. In general, though, Spark is unlikely to
>>> outperform KV stores because of the overhead of scheduling a job for
>>> every operation.
>>>
>>> On Wed, Oct 22, 2014 at 7:51 AM, Hajime Takase <placeofnomemor...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> Is it possible to use Spark as a clustered key/value store (say, like
>>>> redis-cluster or Hazelcast)? Will it perform as well for reads, writes,
>>>> and other operations?
>>>> My main motivation is to use the same RDD from several different
>>>> SparkContexts without saving to disk or using the Spark job server, but
>>>> I'm curious whether someone has already tried using Spark as a
>>>> key/value store.
>>>>
>>>> Thanks,
>>>>
>>>> Hajime
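For reference, the point-query pattern discussed in the thread can be sketched with a plain pair RDD: `partitionBy` gives Spark a partitioner so `lookup` only scans the one partition that can hold the key, rather than the whole dataset. This is a minimal illustrative sketch (the app name, data, and `local[2]` master are placeholders, not from the thread); each `lookup` still schedules a job, which is the per-operation overhead Akshat mentions.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object KvLookupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("kv-lookup-sketch").setMaster("local[2]"))

    // Hash-partition the pair RDD so Spark knows which partition owns each
    // key; lookup() can then run a job over just that single partition.
    val kv = sc.parallelize(Seq("a" -> 1, "b" -> 2, "c" -> 3))
      .partitionBy(new HashPartitioner(4))
      .cache() // keep it in memory, KV-store style

    // Returns all values for the key as a Seq; still one scheduled job
    // per call, unlike a direct get() against a real KV store.
    println(kv.lookup("b"))

    sc.stop()
  }
}
```

Note this only supports reads from the single driver that owns the SparkContext; sharing the cached RDD across multiple drivers is exactly what requires an external owner such as the job server.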