[jira] [Commented] (SPARK-47208) Allow overriding base overhead memory
    [ https://issues.apache.org/jira/browse/SPARK-47208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821601#comment-17821601 ]

Joao Correia commented on SPARK-47208:
---------------------------------------

https://github.com/apache/spark/pull/45240
[jira] [Updated] (SPARK-47208) Allow overriding base overhead memory
    [ https://issues.apache.org/jira/browse/SPARK-47208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joao Correia updated SPARK-47208:
---------------------------------
    Component/s: Spark Core
[jira] [Created] (SPARK-47208) Allow overriding base overhead memory
Joao Correia created SPARK-47208:
------------------------------------

             Summary: Allow overriding base overhead memory
                 Key: SPARK-47208
                 URL: https://issues.apache.org/jira/browse/SPARK-47208
             Project: Spark
          Issue Type: New Feature
          Components: Kubernetes, YARN
    Affects Versions: 3.5.1
            Reporter: Joao Correia


We can already set the desired overhead memory directly via the _'spark.driver/executor.memoryOverhead'_ flags. However, if that flag is not set, the overhead memory is calculated as follows:

{code:java}
overhead_memory = Max(384, 'spark.driver/executor.memory' * 'spark.driver/executor.memoryOverheadFactor')

where the 'memoryOverheadFactor' flag defaults to 0.1{code}

There are cases where being able to override the 384 MiB minimum directly is beneficial. For example, a workload may perform many off-heap operations (e.g., package managers or native compression/decompression) and therefore need only a small JVM heap while still needing a significant amount of memory in the Spark node. Using the '{_}memoryOverheadFactor{_}' flag may not be appropriate here, since we may not want the overhead allocation to scale directly with JVM memory, for cost-saving or resource-limitation reasons.

As such, I propose adding a 'spark.driver/executor.minMemoryOverhead' flag, which can be used to override the 384 MiB value used in the overhead calculation. The memory overhead calculation would then become:

{code:java}
min_memory = sparkConf.get('spark.driver/executor.minMemoryOverhead').getOrElse(384)

overhead_memory = Max(min_memory, 'spark.driver/executor.memory' * 'spark.driver/executor.memoryOverheadFactor'){code}

PR: https://github.com/apache/spark/pull/45240



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
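For illustration, the proposed resolution logic can be sketched as a small, self-contained Scala program. This is a minimal sketch derived solely from the description above, not Spark's actual implementation: the object and helper names (OverheadSketch, resolveOverhead) are hypothetical, and the 'spark.driver/executor.minMemoryOverhead' config exists here only as proposed.

{code:scala}
// Sketch of the proposed overhead resolution, derived from the issue
// description above. Names and defaults follow the proposal, not the
// actual Spark source.
object OverheadSketch {
  val DefaultMinOverheadMib = 384L   // current hard-coded minimum, in MiB
  val DefaultOverheadFactor = 0.1    // spark.*.memoryOverheadFactor default

  // explicitOverhead: spark.driver/executor.memoryOverhead, if set
  // minOverride:      proposed spark.driver/executor.minMemoryOverhead, if set
  // containerMemMib:  spark.driver/executor.memory, in MiB
  def resolveOverhead(
      explicitOverhead: Option[Long],
      minOverride: Option[Long],
      containerMemMib: Long,
      factor: Double = DefaultOverheadFactor): Long =
    explicitOverhead.getOrElse {
      // An explicitly set overhead always wins; otherwise apply the
      // (possibly overridden) floor to the factor-scaled value.
      val floor = minOverride.getOrElse(DefaultMinOverheadMib)
      math.max(floor, (containerMemMib * factor).toLong)
    }

  def main(args: Array[String]): Unit = {
    // 4 GiB executor, no overrides: max(384, 409) = 409 MiB
    println(resolveOverhead(None, None, 4096))
    // Same executor with the proposed floor raised to 1024 MiB: 1024 MiB
    println(resolveOverhead(None, Some(1024L), 4096))
    // An explicit memoryOverhead bypasses the calculation entirely: 2000 MiB
    println(resolveOverhead(Some(2000L), Some(1024L), 4096))
  }
}
{code}

If the flag lands as proposed, it would presumably be set like any other memory config, e.g. '--conf spark.executor.minMemoryOverhead=1024' on spark-submit; the accepted value format is assumed here to mirror the existing 'memoryOverhead' flag.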