Re: Spark as key/value store?

2014-10-22 Thread Akshat Aranya
Spark, in general, is good for iterating through an entire dataset again and again. All operations are expressed in terms of iteration through all the records of at least one partition. You may want to look at IndexedRDD (https://issues.apache.org/jira/browse/SPARK-2365) that aims to improve
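
To make the point about per-partition iteration concrete, here is a rough sketch in plain Python (no Spark involved). It models partitions as lists and compares a key lookup that scans one partition against a lookup through a per-partition hash index. All function names are hypothetical, and this is only an illustration of the general idea, not how Spark or IndexedRDD is actually implemented.

```python
from collections import defaultdict

def partition_for(key, num_partitions):
    # Hash partitioning: each key maps to exactly one partition.
    return hash(key) % num_partitions

def build_partitions(records, num_partitions):
    # Distribute (key, value) pairs across partitions by key hash.
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions

def lookup_by_scan(partitions, key):
    # Without an index: iterate through every record of the one
    # partition that can hold the key.
    part = partitions[partition_for(key, len(partitions))]
    return [v for k, v in part if k == key]

def build_indexes(partitions):
    # Assumed design: one in-memory hash index per partition.
    indexes = []
    for part in partitions:
        index = defaultdict(list)
        for k, v in part:
            index[k].append(v)
        indexes.append(index)
    return indexes

def lookup_by_index(indexes, key):
    # With an index: a single dictionary probe instead of a scan.
    return list(indexes[partition_for(key, len(indexes))].get(key, []))

records = [(i, f"v{i}") for i in range(10)]
partitions = build_partitions(records, 3)
indexes = build_indexes(partitions)
```

Both lookups return the same values; the difference is that the scan touches every record in the partition, while the indexed version does one hash probe at the cost of keeping the index in memory.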

Re: Spark as key/value store?

2014-10-22 Thread Hajime Takase
Thanks!

On Thu, Oct 23, 2014 at 10:56 AM, Akshat Aranya aara...@gmail.com wrote:
> Yes, that is a downside of Spark's design in general. The only way to share data across consumers of data is by having a separate entity that owns the Spark context. That's the idea behind Ooyala's job server.
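
The "separate entity that owns the context" pattern can be sketched in plain Python as follows. The class and method names are hypothetical, and the dictionary is only a stand-in for cached datasets held by one long-running process; this is not the job server's actual API.

```python
class ContextOwner:
    """Hypothetical long-running service that owns the shared data."""

    def __init__(self):
        # name -> cached records, standing in for datasets cached
        # under a single shared context.
        self._datasets = {}

    def put(self, name, records):
        # Load a dataset once; every consumer reuses the same copy.
        self._datasets[name] = list(records)

    def run(self, name, job):
        # Consumers do not hold the data themselves. They submit a
        # function, and the owner runs it against the shared dataset.
        return job(self._datasets[name])

owner = ContextOwner()
owner.put("events", [1, 2, 3, 4])

# Two independent consumers working against the same cached data.
total = owner.run("events", sum)
evens = owner.run("events", lambda rs: [r for r in rs if r % 2 == 0])
```

The design choice here is that data lives with the owner and jobs travel to it, which is why consumers cannot share the data without going through that one process.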