Hello, I was looking for guidelines on what value to set executor memory to (via spark.executor.memory for example).
This seems to be important to avoid OOM during tasks, especially in no swap environments (like AWS EMR clusters). This setting is really about the executor JVM heap. Hence, in order to come up with the maximal amount of heap memory for the executor, we need to list: 1. the memory taken by other processes (Worker in standalone mode, ...) 2. all off-heap allocations in the executor Fortunately, for #1, we can just look at memory consumption without any application running. For #2, it is trickier. What I suspect we should account for: a. thread stack size b. akka buffers (via akka framesize & number of akka threads) c. kryo buffers d. shuffle buffers (e. tachyon) Could anyone shed some light on this? Maybe a formula? Or maybe swap should actually be turned on, as a safeguard against OOMs? Thanks