With 3x replication, we should still get fault tolerance as long as at
least one replica of each block survives. A checkpointed RDD can be
cleared once another in-memory checkpointed RDD exists further down the
lineage, and it can avoid hitting disk if there is enough memory. We need
to investigate more to find a good solution. -Xiangrui
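
To make the reliability trade-off in this thread concrete, here is a rough back-of-the-envelope sketch in plain Python. It is not Spark code. It assumes (my assumptions, not something stated in the thread) that each block's replicas sit on distinct nodes chosen uniformly at random, that blocks are placed independently, and that a given number of nodes fail at the same time. The function names are made up for illustration.

```python
from math import comb


def block_loss_probability(nodes: int, replicas: int, failed: int) -> float:
    """Chance that every replica of one block sits on a failed node.

    Assumes replicas are on distinct nodes picked uniformly at random and
    that exactly `failed` of the `nodes` nodes are down at once.
    """
    if not 0 < replicas <= nodes:
        raise ValueError("replicas must be between 1 and nodes")
    if not 0 <= failed <= nodes:
        raise ValueError("failed must be between 0 and nodes")
    # comb(failed, replicas) is 0 when fewer nodes fail than there are
    # replicas, so the block always survives in that case.
    return comb(failed, replicas) / comb(nodes, replicas)


def rdd_loss_probability(nodes: int, replicas: int, failed: int,
                         partitions: int) -> float:
    """Chance that at least one of `partitions` blocks loses all replicas.

    With lineage truncated, losing any single block makes the RDD
    unrecoverable, so this is the figure that matters for the checkpoint.
    """
    q = block_loss_probability(nodes, replicas, failed)
    return 1.0 - (1.0 - q) ** partitions


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes, 3 simultaneous failures, 100 partitions.
    for r in (1, 2, 3):
        print(r, rdd_loss_probability(10, r, 3, 100))
```

Under these assumptions, fewer simultaneous failures than replicas never lose a block, which is the case for replication. Once the number of failed nodes reaches the replication factor, the chance of losing at least one block grows with the partition count, which is the concern about truncating lineage without a reliable store.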

On Fri, May 16, 2014 at 4:00 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
> Effectively this is persist without fault tolerance.
> Failure of any node means complete lack of fault tolerance.
> I would be very skeptical of truncating lineage if it is not reliable.
>  On 17-May-2014 3:49 am, "Xiangrui Meng (JIRA)" <j...@apache.org> wrote:
>
>> Xiangrui Meng created SPARK-1855:
>> ------------------------------------
>>
>>              Summary: Provide memory-and-local-disk RDD checkpointing
>>                  Key: SPARK-1855
>>                  URL: https://issues.apache.org/jira/browse/SPARK-1855
>>              Project: Spark
>>           Issue Type: New Feature
>>           Components: MLlib, Spark Core
>>     Affects Versions: 1.0.0
>>             Reporter: Xiangrui Meng
>>
>>
>> Checkpointing is used to cut long lineage while maintaining fault
>> tolerance. The current implementation is HDFS-based. Using the BlockRDD we
>> can create in-memory-and-local-disk (with replication) checkpoints that are
>> not as reliable as the HDFS-based solution but faster.
>>
>> It can help applications that require many iterations.
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.2#6252)
>>
