The driver stores the metadata (the lineage) needed to rebuild each partition, but the recomputation itself runs on an executor. So if several partitions are lost, e.g. because a few machines fail, the recomputation can be spread across the cluster, which makes recovery fast.
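To make that concrete, here is a rough toy model in plain Python. It is not Spark code and not how Spark is implemented: `ToyRDD`, its `cache` dict, and the thread pool standing in for executors are all names I made up for illustration. The only point is that the "driver" side keeps just the recipe for each partition, so lost partitions can be rebuilt independently and in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRDD:
    """Hypothetical stand-in for an RDD: holds lineage (a recipe), not data."""

    def __init__(self, num_partitions, compute):
        self.num_partitions = num_partitions
        self._compute = compute  # partition index -> list of records
        self.cache = {}          # materialized partitions, which may be lost

    def map(self, f):
        # The child remembers its parent and the function: that is its lineage.
        parent = self
        return ToyRDD(
            self.num_partitions,
            lambda i: [f(x) for x in parent.partition(i)],
        )

    def partition(self, i):
        # Rebuild from lineage only when the materialized copy is missing.
        if i not in self.cache:
            self.cache[i] = self._compute(i)
        return self.cache[i]


# Source data: 4 partitions of 3 consecutive integers each.
source = ToyRDD(4, lambda i: list(range(i * 3, i * 3 + 3)))
squares = source.map(lambda x: x * x)

# Materialize everything once.
before = [squares.partition(i) for i in range(squares.num_partitions)]

# Simulate losing two partitions, e.g. because two machines failed.
lost = [1, 3]
for i in lost:
    del squares.cache[i]

# Worker threads stand in for executors; each rebuilds one lost partition.
with ThreadPoolExecutor(max_workers=2) as pool:
    rebuilt = list(pool.map(squares.partition, lost))

after = [squares.partition(i) for i in range(squares.num_partitions)]
```

After this runs, `after` equals `before`, and only partitions 1 and 3 were recomputed. Nothing had to pass through a single coordinating process beyond knowing which recipe to run.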
On Wed, Apr 2, 2014 at 11:27 AM, David Thomas <dt5434...@gmail.com> wrote:
> Can someone explain how RDD is resilient? If one of the partition is lost,
> who is responsible to recreate that partition - is it the driver program?