Calling cache/persist causes all our jobs to fail (I have posted 2 threads on this).
And we're giving up hope of finding a solution, so I'd like to find a workaround: if I save an RDD to HDFS and read it back, can I use it in more than one operation?

Example (using cache):

    // do a whole bunch of transformations on an RDD
    myrdd.cache()
    val result1 = myrdd.map(op1(_))
    val result2 = myrdd.map(op2(_))
    // in the above I am assuming that a call to cache will prevent
    // all previous transformations from being calculated twice

I'd like to somehow get result1 and result2 without duplicating work. How can I do that?

Thanks,
Jeff