It would be nice if the RDD cache() method incorporated depth information.
That is,
void test()
{
JavaRDD. rdd = .;
rdd.cache(); // to depth 1. actual caching happens.
rdd.cache(); // to depth 2. No-op as long as the storage level is the same;
             // otherwise, an exception.
.
rdd.uncache(); // to
This is a pretty cool idea — instead of cache depth I’d call it something like
reference counting. Would you mind opening a JIRA issue about it?
The issue of really composing together libraries that use RDDs nicely isn’t
fully explored, but this is certainly one thing that would help with it.
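To make the reference-counting idea concrete, here is a rough sketch of how it might look. It is not Spark code: `Persistable` is a hypothetical stand-in for the two RDD operations involved (in Spark these would presumably map to `persist(StorageLevel)` and `unpersist()`), `RefCountedCache` is an invented name, and storage levels are modelled as plain strings. The `uncache()` behaviour (decrement, and release only when the count reaches zero) is an assumption, since the original snippet does not spell it out.

```java
import java.util.Objects;

/** Hypothetical stand-in for the two RDD operations this sketch needs. */
interface Persistable {
    void persist(String storageLevel);
    void unpersist();
}

/** Hypothetical reference-counting wrapper; not part of Spark. */
final class RefCountedCache {
    private final Persistable target;
    private int depth = 0;        // number of outstanding cache() calls
    private String level = null;  // storage level chosen by the first cache()

    RefCountedCache(Persistable target) {
        this.target = Objects.requireNonNull(target);
    }

    /** Only the first call persists; later calls just raise the depth. */
    synchronized void cache(String storageLevel) {
        Objects.requireNonNull(storageLevel);
        if (depth == 0) {
            target.persist(storageLevel);
            level = storageLevel;
        } else if (!level.equals(storageLevel)) {
            throw new IllegalStateException(
                "already cached at " + level + ", requested " + storageLevel);
        }
        depth++;
    }

    /** Assumed semantics: unpersist only when the last holder lets go. */
    synchronized void uncache() {
        if (depth == 0) {
            throw new IllegalStateException("uncache() without matching cache()");
        }
        depth--;
        if (depth == 0) {
            target.unpersist();
            level = null;
        }
    }

    synchronized int depth() {
        return depth;
    }
}
```

With something like this, two libraries sharing one RDD could each call cache() and uncache() in pairs without one of them releasing data the other still needs.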
Opened a JIRA issue. (https://issues.apache.org/jira/browse/SPARK-1962)
Thanks.
-----Original Message-----
From: Matei Zaharia [mailto:matei.zaha...@gmail.com]
Sent: Thursday, May 29, 2014 3:54 PM
To: dev@spark.apache.org
Subject: Re: Suggestion: RDD cache depth