[jira] [Commented] (SPARK-1855) Provide memory-and-local-disk RDD checkpointing

koert kuipers (JIRA) Thu, 24 Jul 2014 15:50:34 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073784#comment-14073784
 ]


koert kuipers commented on SPARK-1855:
--------------------------------------

I think this makes sense. We have iterative queries that should be very quick. 
If a machine fails, I am OK with the query failing; we will simply rerun it. So 
I do not care about checkpointing to disk in this situation. But I do care about 
checkpointing to memory to cut my dependencies, which means they get garbage 
collected and cached RDDs get cleaned up.
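
To illustrate the idea of cutting dependencies by checkpointing to memory, here 
is a minimal sketch in plain Python. It is a toy model, not Spark code: the 
ToyDataset class and its methods (map, collect, lineage_depth, 
checkpoint_in_memory) are hypothetical names made up for this example. It only 
shows that once a dataset holds its own materialized rows, the link to its 
parents can be dropped and the parents become eligible for garbage collection.

```python
import gc
import weakref


class ToyDataset:
    """Hypothetical stand-in for an RDD: a lazy step with a link to its parent."""

    def __init__(self, compute, parent=None):
        self._compute = compute      # function: parent rows -> rows
        self.parent = parent         # lineage pointer (None for a source)
        self._materialized = None    # filled in by checkpoint_in_memory()

    def map(self, fn):
        # The closure captures only fn, so the parent is reachable
        # solely through the explicit parent link.
        return ToyDataset(lambda rows: [fn(r) for r in rows], parent=self)

    def collect(self):
        if self._materialized is not None:
            return list(self._materialized)
        rows = self.parent.collect() if self.parent is not None else []
        return self._compute(rows)

    def lineage_depth(self):
        depth, node = 0, self
        while node.parent is not None:
            depth, node = depth + 1, node.parent
        return depth

    def checkpoint_in_memory(self):
        """Materialize the rows, then drop the parent link to cut the lineage."""
        self._materialized = self.collect()
        self.parent = None
        self._compute = None


def demo():
    source = ToyDataset(lambda _: [1, 2, 3])
    current = source
    for _ in range(5):               # five iterations, each adding a lineage step
        current = current.map(lambda x: x + 1)

    intermediate = weakref.ref(current.parent)   # watch one ancestor
    depth_before = current.lineage_depth()

    current.checkpoint_in_memory()
    del source                       # drop our own last reference to the chain
    gc.collect()

    return {
        "depth_before": depth_before,
        "depth_after": current.lineage_depth(),
        "rows": current.collect(),
        "ancestor_freed": intermediate() is None,
    }


if __name__ == "__main__":
    print(demo())
```

The trade-off is the one described above: once the lineage is cut, a lost 
in-memory copy cannot be recomputed from its parents, so the whole query has to 
be repeated.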

> Provide memory-and-local-disk RDD checkpointing
> -----------------------------------------------
>
>                 Key: SPARK-1855
>                 URL: https://issues.apache.org/jira/browse/SPARK-1855
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib, Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Xiangrui Meng
>
> Checkpointing is used to cut long lineage while maintaining fault tolerance. 
> The current implementation is HDFS-based. Using the BlockRDD we can create 
> in-memory-and-local-disk (with replication) checkpoints that are not as 
> reliable as the HDFS-based solution but faster.
> It can help applications that require many iterations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
