Hi Kant Kodali,

Based on the storage level passed to the persist() method, the RDD is either cached in memory or persisted to disk. In case of failures, Spark reconstructs the lost RDD partitions on a different executor using the lineage recorded in the DAG; that is how failures are handled. Spark Core does not replicate RDDs by default, because they can be reconstructed from the source (say HDFS, Hive, or S3), but not from memory, which is already lost.
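As a rough sketch of what this looks like in code (assuming an existing SparkContext `sc` and a made-up input path; this won't run outside a Spark application):

```scala
import org.apache.spark.storage.StorageLevel

// Hypothetical input; the RDD's lineage points back to this source,
// so lost partitions can be recomputed from it after a failure.
val rdd = sc.textFile("hdfs:///data/input.txt")

// MEMORY_ONLY: keep deserialized partitions in memory; if an executor
// dies, the lost partitions are recomputed from lineage, not restored
// from a replica.
rdd.persist(StorageLevel.MEMORY_ONLY)

// MEMORY_AND_DISK: partitions that don't fit in memory spill to the
// executor's local disk instead of being dropped.
// rdd.persist(StorageLevel.MEMORY_AND_DISK)

// MEMORY_ONLY_2: opt-in replication — each partition is also stored on
// a second executor, trading memory for faster recovery.
// rdd.persist(StorageLevel.MEMORY_ONLY_2)
```

Note that replication is available if you ask for it (the `_2` storage levels), but it is not the default, precisely because lineage-based recomputation usually makes it unnecessary.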
Thanks,
Sreekanth Jella