If one executor fails, Spark moves the processing over to another executor.
However, if the data is lost, it re-executes the processing that generated
the data, and might have to go back all the way to the source. Does this
mean that only the tasks the dead executor was running at the time need to
be rerun to regenerate those stages? If I am correct, Spark uses the RDD
lineage to figure out what needs to be re-executed. Remember, we are
talking about executor failure here, not node failure. I don't know the
details of how it determines which tasks to rerun, but I am guessing that
if it is a multi-stage job, it might have to rerun all the stages again.
For example, if you have done a groupBy, you will have two stages. After
the first stage, the data will be shuffled by hashing the groupBy key, so
that data for the same key value lands in the same partition. Now, if one
of those partitions is lost during execution of the second stage, I am
guessing Spark will have to go back and re-execute all the tasks in the
first stage.
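
As a rough sketch, this is what such a two-stage job looks like in Scala
(assuming local mode; the app name, data and the per-key sum are made up
purely for illustration):

import org.apache.spark.sql.SparkSession

object TwoStageExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("two-stage-example")
      .master("local[*]") // assumption: local mode, just for illustration
      .getOrCreate()
    val sc = spark.sparkContext

    // Stage 1 (map side): each task hashes the groupBy key and writes
    // shuffle files so that records with the same key land in the same
    // shuffle partition.
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4)))

    // groupByKey is the shuffle boundary that splits the job into 2 stages.
    val grouped = pairs.groupByKey().mapValues(_.sum)

    // Stage 2 (reduce side): each task fetches its shuffle partition from
    // the executors that ran stage 1. If one of those executors has died,
    // the fetch fails and Spark walks the lineage (pairs -> grouped) to
    // resubmit the stage-1 tasks that produced the missing output.
    grouped.collect().foreach(println)

    spark.stop()
  }
}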

HTH

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


LinkedIn: https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/
https://en.everybodywiki.com/Mich_Talebzadeh



On Tue, 20 Jun 2023 at 20:07, Nikhil Goyal <nownik...@gmail.com> wrote:

> Hi folks,
> When running Spark on K8s, what would happen to shuffle data if an
> executor is terminated or lost? Since there is no shuffle service, does all
> the work done by that executor get recomputed?
>
> Thanks
> Nikhil
>
