Just to correct my last sentence: if we end up starting a new instance of
Spark, I don't think it will be able to read the shuffle data that another
instance wrote to storage, but I stand to be corrected.
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom
Hi Maksym,
Let us understand the basics here first.
My thoughts: Spark replicates the partitions among multiple nodes. If one
executor fails, it moves the processing over to another executor.
However, if the data is lost, it re-executes the processing that generated
the data, and might have to go back further in the lineage to recompute it.
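For example, replication of cached partitions is requested through the
storage level; a minimal sketch in Scala, with a made-up input path:

    import org.apache.spark.storage.StorageLevel

    // Sketch: MEMORY_AND_DISK_2 keeps two replicas of each cached
    // partition on different executors, so losing one executor does not
    // force Spark to re-execute the lineage behind the data.
    // "spark" is the usual SparkSession from spark-shell.
    val df = spark.read.parquet("s3://my-bucket/input") // hypothetical path
    val replicated = df.persist(StorageLevel.MEMORY_AND_DISK_2)
    replicated.count() // materialise the cache and its replicas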
Hey Vaquar,
The link doesn't explain the crucial detail we're interested in: does an
executor re-use the data that already exists on a node from a previous
executor, and if not, how can we configure it to do so?
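If the external shuffle service is what provides this, here is a sketch of
the driver-side settings we imagine, assuming a YARN deployment (the app
name is made up):

    import org.apache.spark.sql.SparkSession

    // Sketch, assuming YARN rather than Kubernetes. With the external
    // shuffle service enabled, shuffle files on a node stay readable
    // through the NodeManager after the executor that wrote them dies,
    // so a replacement executor can fetch them instead of recomputing
    // the map stage. The service must also be registered as a YARN
    // auxiliary service on each node.
    val spark = SparkSession.builder()
      .appName("shuffle-reuse-sketch") // hypothetical
      .config("spark.shuffle.service.enabled", "true")
      .config("spark.dynamicAllocation.enabled", "true")
      .getOrCreate()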
We are not running on kubernetes, so EKS/Kubernetes-specific advice isn't
very relevant.
We are ru
From the following link you will get all the required details:
https://aws.amazon.com/blogs/containers/best-practices-for-running-spark-on-amazon-eks/
Let me know if you require further information.
Regards,
Vaquar khan
On Mon, May 15, 2023, 10:14 PM Mich Talebzadeh wrote:
A couple of points.
Why use spot or pre-emptible instances when your application, as you
stated, shuffles heavily?
Have you looked at why you are having these shuffles? What is the cause of
these large transformations ending up in a shuffle?
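For example, a wide join shuffles both sides unless the smaller side is
broadcast; a sketch with made-up table paths and join key:

    import org.apache.spark.sql.functions.broadcast

    // Sketch with hypothetical paths and column name. A plain join on
    // "user_id" shuffles both tables; broadcasting the small side ships
    // it to every executor, so the large side is never shuffled.
    val orders = spark.read.parquet("s3://my-bucket/orders") // large
    val users  = spark.read.parquet("s3://my-bucket/users")  // small
    val joined = orders.join(broadcast(users), "user_id")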
Also on your point:
"..then ideally we should expect that when an ex
Hello,
We've been in touch with a few Spark specialists who suggested a potential
solution to us for improving the reliability of our jobs that are shuffle
heavy.
Here is what our setup looks like:
- Spark version: 3.3.1
- Java version: 1.8
- We do not use an external shuffle service
- We use s