Re: Spark caching questions

2014-09-10 Thread Mayur Rustagi
Cached RDDs do not survive SparkContext shutdown (they are scoped on a
per-SparkContext basis).
I am not sure what you mean by disk-based cache eviction; if you cache more
RDDs than you have disk space for, the result will not be very pretty :)
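On question 1: Spark's in-memory block store drops cached partitions in LRU order when space runs low (with MEMORY_AND_DISK, dropped blocks spill to disk rather than being recomputed). A minimal toy sketch of LRU eviction order, using Python's OrderedDict; the class, block IDs, and capacity are illustrative and not Spark's actual BlockManager:

```python
from collections import OrderedDict

class LRUBlockStore:
    """Toy LRU store illustrating eviction order; not Spark's BlockManager."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # order of keys = recency of use

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)  # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, data):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)
        self.blocks[block_id] = data
        evicted = []
        while len(self.blocks) > self.capacity:
            # drop the least recently used block first
            old_id, _ = self.blocks.popitem(last=False)
            evicted.append(old_id)
        return evicted

store = LRUBlockStore(capacity=2)
store.put("rdd_0_0", b"part0")
store.put("rdd_0_1", b"part1")
store.get("rdd_0_0")                      # touch partition 0
evicted = store.put("rdd_0_2", b"part2")  # capacity exceeded
print(evicted)  # prints ['rdd_0_1']
```

The point of the sketch: the partition touched most recently ("rdd_0_0") survives, while the least recently used one ("rdd_0_1") is evicted first.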

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi https://twitter.com/mayur_rustagi


On Wed, Sep 10, 2014 at 4:43 AM, Vladimir Rodionov 
vrodio...@splicemachine.com wrote:

 Hi, users

 1. Disk-based cache eviction policy? Is it the same LRU?

 2. What is the scope of a cached RDD? Does it survive the application? What
 happens if I run the Java app next time? Will the RDD be re-created or read
 from the cache?

 If the answer is YES, then ...


 3. Is there any way to invalidate a cached RDD automatically? RDD
 partitions? Some API like RDD.isValid()?

 4. HadoopRDD is InputFormat-based. Some partitions (splits) may become
 invalid in the cache. Can we reload only those partitions into the cache?

 -Vladimir



Spark caching questions

2014-09-09 Thread Vladimir Rodionov
Hi, users

1. Disk-based cache eviction policy? Is it the same LRU?

2. What is the scope of a cached RDD? Does it survive the application? What
happens if I run the Java app next time? Will the RDD be re-created or read
from the cache?

If the answer is YES, then ...


3. Is there any way to invalidate a cached RDD automatically? RDD
partitions? Some API like RDD.isValid()?

4. HadoopRDD is InputFormat-based. Some partitions (splits) may become
invalid in the cache. Can we reload only those partitions into the cache?

-Vladimir