Thanks!

On Thu, Oct 23, 2014 at 10:56 AM, Akshat Aranya <aara...@gmail.com> wrote:

> Yes, that is a downside of Spark's design in general. The only way to
> share data across consumers is by having a separate entity that owns the
> Spark context. That's the idea behind Ooyala's job server. The driver is
> still a single point of failure; if you lose the driver process, you lose
> all information about the RDDs.
>
> On Oct 22, 2014 6:33 PM, "Hajime Takase" <placeofnomemor...@gmail.com> wrote:
>
>> Interesting. I see the interface of IndexedRDD, which looks like a
>> key/value store scoped to a particular SparkContext:
>> https://github.com/apache/spark/pull/1297
>> But a different SparkContext can't use another context's IndexedRDD?
>> (I want to use multiple "drivers" in my system.)
>>
>> On Thu, Oct 23, 2014 at 1:01 AM, Akshat Aranya <aara...@gmail.com> wrote:
>>
>>> Spark, in general, is good for iterating through an entire dataset
>>> again and again. All operations are expressed in terms of iteration
>>> through all the records of at least one partition. You may want to look
>>> at IndexedRDD (https://issues.apache.org/jira/browse/SPARK-2365), which
>>> aims to improve point queries. In general, though, Spark is unlikely to
>>> outperform KV stores because of the overhead of scheduling a job for
>>> every operation.
>>>
>>> On Wed, Oct 22, 2014 at 7:51 AM, Hajime Takase <placeofnomemor...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> Is it possible to use Spark as a clustered key/value store (say, like
>>>> redis-cluster or Hazelcast)? Will it perform as well for reads, writes,
>>>> and other operations?
>>>> My main motivation is to use the same RDD from several different
>>>> SparkContexts without saving to disk or using the Spark job server, but
>>>> I'm curious whether someone has already tried using Spark as a
>>>> key/value store.
>>>>
>>>> Thanks,
>>>>
>>>> Hajime
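For reference, the point-query pattern discussed in the thread can be sketched with a plain pair RDD: `partitionBy` gives Spark a partitioner so `lookup` only scans the one partition that can hold the key, rather than the whole dataset. This is a minimal illustrative sketch (the app name, data, and `local[2]` master are placeholders, not from the thread); each `lookup` still schedules a job, which is the per-operation overhead Akshat mentions.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object KvLookupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("kv-lookup-sketch").setMaster("local[2]"))

    // Hash-partition the pair RDD so Spark knows which partition owns each
    // key; lookup() can then run a job over just that single partition.
    val kv = sc.parallelize(Seq("a" -> 1, "b" -> 2, "c" -> 3))
      .partitionBy(new HashPartitioner(4))
      .cache() // keep it in memory, KV-store style

    // Returns all values for the key as a Seq; still one scheduled job
    // per call, unlike a direct get() against a real KV store.
    println(kv.lookup("b"))

    sc.stop()
  }
}
```

Note this only supports reads from the single driver that owns the SparkContext; sharing the cached RDD across multiple drivers is exactly what requires an external owner such as the job server.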