Hi, vtygoss

As I recall, memoryOverhead in Spark 2.3 covers all memory that is not executor on-heap memory, including the memory used by Spark's off-heap memory pool (executorOffHeapMemory, a concept that also exists in Spark 2.3), PySpark workers, PipeRDD, the Netty memory pool, JVM direct memory, and so on.

In Spark 2.3, there is no enforced size relationship between memoryOverhead and executorOffHeapMemory. For example, if the user configures executorMemory=1g, executorOffHeapMemory=2g, and executorMemoryOverhead=1g, no error is raised at the resource-request stage: only 2g (executorMemory + executorMemoryOverhead) is requested from YARN, even though at least 4g is actually required.

In Spark 3.x, executorMemoryOverhead no longer includes the memory used by Spark's off-heap memory pool (executorOffHeapMemory); it is requested as a separate component, which I think ensures the off-heap pool has enough memory.
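To make the difference concrete, here is a small sketch of the two request formulas using the example configuration above (the object and method names are illustrative, not Spark APIs; sizes are in MiB):

```scala
object MemoryRequestSketch {
  // Example configuration from above, in MiB
  val executorMemory        = 1024 // executorMemory = 1g
  val executorOffHeapMemory = 2048 // executorOffHeapMemory = 2g
  val memoryOverhead        = 1024 // executorMemoryOverhead = 1g
  val pysparkWorkerMemory   = 0    // PySpark worker memory not configured

  // Spark 2.3 formula: the off-heap pool is not part of the YARN request
  def requestSpark23: Int = executorMemory + memoryOverhead

  // Spark 3.2.1 formula: every component is summed, so off-heap memory
  // is reserved up front
  def requestSpark32: Int =
    executorMemory + executorOffHeapMemory + memoryOverhead + pysparkWorkerMemory

  def main(args: Array[String]): Unit = {
    println(s"Spark 2.3 request:   $requestSpark23 MiB") // 2048 MiB
    println(s"Spark 3.2.1 request: $requestSpark32 MiB") // 4096 MiB
  }
}
```

With the same settings, Spark 2.3 would request a 2g container while peak usage can reach 4g; Spark 3.2.1 requests the full 4g.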

Warm regards,
YangJie


From: vtygoss <vtyg...@126.com>
Date: Thursday, August 25, 2022, 20:02
To: spark <dev@spark.apache.org>
Subject: memory module of yarn container


Hi, community!



I noticed a change in the memory model of the YARN container between spark-2.3.0
and spark-3.2.1 when requesting containers from YARN.



org.apache.spark.deploy.yarn.Client#verifyClusterResources



```
spark-2.3.0
val executorMem = executorMemory + executorMemoryOverhead
```



```
spark-3.2.1
val executorMem =
  executorMemory + executorOffHeapMemory + executorMemoryOverhead + pysparkWorkerMemory
```

And I have these questions:

1. In spark-2.3.0 and spark-3.2.1, what is memoryOverhead and where is it used?
2. What is the difference between memoryOverhead and off-heap memory, native
memory, and direct memory? There is no such concept in Apache Flink; is it a
concept unique to Spark?
3. In spark-2.3.0, I think that memoryOverhead contains all non-heap memory,
including off-heap / native / direct memory. Am I wrong?

Thanks for any replies.

Best Regards!

