Effectively, this is persist without fault tolerance. Failure of any node leaves no fault tolerance at all, since the truncated lineage can no longer be used to recompute the lost data. I would be very skeptical of truncating lineage if the checkpoint is not reliable.

On 17-May-2014 3:49 am, "Xiangrui Meng (JIRA)" <j...@apache.org> wrote:
> Xiangrui Meng created SPARK-1855:
> ------------------------------------
>
> Summary: Provide memory-and-local-disk RDD checkpointing
> Key: SPARK-1855
> URL: https://issues.apache.org/jira/browse/SPARK-1855
> Project: Spark
> Issue Type: New Feature
> Components: MLlib, Spark Core
> Affects Versions: 1.0.0
> Reporter: Xiangrui Meng
>
>
> Checkpointing is used to cut long lineage while maintaining fault
> tolerance. The current implementation is HDFS-based. Using the BlockRDD we
> can create in-memory-and-local-disk (with replication) checkpoints that are
> not as reliable as HDFS-based solution but faster.
>
> It can help applications that require many iterations.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
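
To make the concern concrete, here is a toy model in plain Python. It is not Spark code: `ToyDataset`, `DataLost`, `store_on`, `truncate_lineage`, `fail_node`, and the node names are all made up for illustration. It only sketches the difference between a persisted dataset that keeps its lineage and a locally checkpointed one that does not, assuming copies live on individual nodes and vanish when those nodes fail.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class DataLost(Exception):
    """Raised when no replica survives and there is no lineage to recompute from."""


@dataclass
class ToyDataset:
    # Lineage: how to rebuild the data from scratch (None once truncated).
    compute: Optional[Callable[[], list]]
    # Node name -> local copy of the data (stands in for memory/disk blocks).
    replicas: dict = field(default_factory=dict)

    def store_on(self, nodes):
        """Materialize the data and place one copy on each given node."""
        if self.compute is None:
            raise DataLost("cannot materialize: lineage was truncated")
        data = self.compute()
        for node in nodes:
            self.replicas[node] = list(data)

    def truncate_lineage(self):
        """Forget how the data was derived, as a checkpoint does."""
        self.compute = None

    def fail_node(self, node):
        """Simulate losing a node together with its local copy."""
        self.replicas.pop(node, None)

    def read(self):
        # Prefer any surviving replica.
        for data in self.replicas.values():
            return data
        # Otherwise fall back to recomputation, if lineage still exists.
        if self.compute is not None:
            return self.compute()
        raise DataLost("all replicas gone and lineage was truncated")


def squares():
    """Hypothetical upstream computation standing in for a long lineage."""
    return [i * i for i in range(5)]


# Persisted with lineage kept: survives the loss of every replica.
persisted = ToyDataset(compute=squares)
persisted.store_on(["node-a", "node-b"])
persisted.fail_node("node-a")
persisted.fail_node("node-b")
print(persisted.read())  # recomputed through lineage

# Locally checkpointed: lineage is cut, so replicas are the only copy.
checkpointed = ToyDataset(compute=squares)
checkpointed.store_on(["node-a", "node-b"])
checkpointed.truncate_lineage()
checkpointed.fail_node("node-a")
print(checkpointed.read())  # still served by the replica on node-b
checkpointed.fail_node("node-b")
try:
    checkpointed.read()
except DataLost as err:
    print("unrecoverable:", err)
```

In this model replication only delays the problem: the checkpointed dataset survives as long as one replica does, and after that nothing can bring it back, whereas the persisted one can always be recomputed.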