The following passage, copied from the paper, relates to RDD lineage.
Is there a unit test that covers this scenario (an RDD partition being lost and then recovered)?
Thanks.

If a partition of an RDD is lost, the RDD has enough information about how it
was derived from other RDDs to recompute just that partition. Thus, lost data
can be recovered, often quite quickly, without requiring costly replication.
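
To make the quoted idea concrete, here is a minimal toy sketch in plain Python. It is not Spark's API or implementation: ToyRDD, from_lists, lose_partition and compute_count are hypothetical names invented for illustration. It only models the principle that a derived dataset keeps a reference to its parent and the function applied, so a single lost partition can be rebuilt without touching the others.

```python
# Toy model of lineage-based recovery. This is NOT Spark code; every name
# here (ToyRDD, from_lists, lose_partition, compute_count) is hypothetical.


class ToyRDD:
    """A partitioned dataset that remembers how it was derived."""

    def __init__(self, num_partitions, source=None, parent=None, fn=None):
        self.num_partitions = num_partitions
        self._source = source   # base data, one list per partition (root only)
        self._parent = parent   # lineage: the dataset this one was derived from
        self._fn = fn           # lineage: per-element function applied to the parent
        self._cache = {}        # partitions that are currently materialized
        self.compute_count = 0  # how many partition computations have run

    @classmethod
    def from_lists(cls, partitions):
        return cls(len(partitions), source=partitions)

    def map(self, fn):
        # Nothing is computed here; only the lineage is recorded.
        return ToyRDD(self.num_partitions, parent=self, fn=fn)

    def partition(self, i):
        # Compute partition i from the lineage only if it is missing.
        if i not in self._cache:
            self.compute_count += 1
            if self._parent is None:
                data = list(self._source[i])
            else:
                data = [self._fn(x) for x in self._parent.partition(i)]
            self._cache[i] = data
        return self._cache[i]

    def lose_partition(self, i):
        # Simulate losing the materialized copy of partition i.
        self._cache.pop(i, None)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


base = ToyRDD.from_lists([[1, 2], [3, 4], [5, 6]])
squares = base.map(lambda x: x * x)

first = squares.collect()   # materializes all three partitions
squares.lose_partition(1)   # pretend the node holding partition 1 failed
after = squares.collect()   # only partition 1 is recomputed from its parent
```

In this sketch the second collect() returns the same data as the first, and squares.compute_count grows by one rather than three, which mirrors the "recompute just that partition" claim in the quoted passage. Real Spark tracks dependencies per RDD and schedules recomputation as tasks, which this toy does not attempt to model.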



bit1...@163.com
 
From: bit1...@163.com
Date: 2015-07-31 13:11
To: Tathagata Das; yuzhihong
CC: user
Subject: Re: Re: How RDD lineage works
Thanks, TD and Zhihong, for the guidance. I will check it out.




bit1...@163.com
 
From: Tathagata Das
Date: 2015-07-31 12:27
To: Ted Yu
CC: bit1...@163.com; user
Subject: Re: How RDD lineage works
You have to read the original Spark paper to understand how RDD lineage works. 
https://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf

On Thu, Jul 30, 2015 at 9:25 PM, Ted Yu <yuzhih...@gmail.com> wrote:
Please take a look at:
core/src/test/scala/org/apache/spark/CheckpointSuite.scala

Cheers

On Thu, Jul 30, 2015 at 7:39 PM, bit1...@163.com <bit1...@163.com> wrote:
Hi,

I don't have a good understanding of how RDD lineage works, so I would like to ask whether
Spark provides a unit test in the code base that illustrates how RDD lineage works.
If there is one, what is the class name?
Thanks!



bit1...@163.com

