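Checkpoint clears an RDD's dependencies: the RDD is saved to the checkpoint directory and its references to parent RDDs are removed, which persisting to disk does not do. So it is not only for streaming. You might need checkpoint to cut a long lineage in iterative algorithms.

-Xiangrui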

On Mon, Apr 21, 2014 at 11:34 AM, Diana Carroll <dcarr...@cloudera.com> wrote:
> I'm trying to understand when I would want to checkpoint an RDD rather than
> just persist to disk.
>
> Every reference I can find to checkpoint relates to Spark Streaming. But
> the method is defined in the core Spark library, not Streaming.
>
> Does it exist solely for streaming, or are there circumstances unrelated to
> streaming in which I might want to checkpoint...and if so, like what?
>
> Thanks,
> Diana