Just to correct the last sentence: if we end up starting a new instance of
Spark, I don't think it will be able to read the shuffle data that another
instance wrote to storage. I stand to be corrected.
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Ki
Hi Maksym.
Let us understand the basics here first.
My thoughts: Spark replicates the partitions among multiple nodes. If one
executor fails, it moves the processing over to another executor.
However, if the data is lost, it re-executes the processing that generated
the data,
and might have to go b
Hey vaquar,
The link doesn't explain the crucial detail we're interested in: does an executor
re-use the data that exists on a node from a previous executor and, if not, how
can we configure it to do so?
We are not running on Kubernetes, so EKS/Kubernetes-specific advice isn't
very relevant.
We are ru