Hi Kant Kodali,

Based on the StorageLevel passed to the persist() method, the RDD will either 
be cached in memory or persisted to disk. In case of failures, Spark 
reconstructs the lost RDD partitions on a different executor by replaying the 
lineage (DAG). That is how failures are handled. Spark Core does not replicate 
RDDs by default, since they can be recomputed from the source (say HDFS, Hive, 
S3, etc.), but not from the memory of the lost executor (which is already gone).
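
For illustration, here is a minimal Scala sketch (the input path, app name and 
filter are just hypothetical placeholders) showing the two common storage 
levels and where lineage-based recovery comes in:

import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object PersistSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("PersistSketch").getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical source path -- replace with your own HDFS/S3/Hive input.
    val lines = sc.textFile("hdfs:///data/events.log")

    // MEMORY_ONLY: keep partitions in memory; anything that does not fit
    // (or is lost with an executor) is recomputed from the lineage.
    val cached = lines.filter(_.nonEmpty).persist(StorageLevel.MEMORY_ONLY)

    // Alternative: spill to local disk whatever does not fit in memory.
    // val cached = lines.filter(_.nonEmpty).persist(StorageLevel.MEMORY_AND_DISK)

    // If an executor holding cached partitions dies, Spark rebuilds just the
    // missing partitions on another executor by replaying textFile -> filter.
    println(cached.count())

    spark.stop()
  }
}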

Thanks,
Sreekanth Jella
