[jira] [Commented] (SPARK-47208) Allow overriding base overhead memory

2024-02-28 Thread Joao Correia (Jira)


[ https://issues.apache.org/jira/browse/SPARK-47208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821601#comment-17821601 ]

Joao Correia commented on SPARK-47208:
--

https://github.com/apache/spark/pull/45240

> Allow overriding base overhead memory
> -
>
> Key: SPARK-47208
> URL: https://issues.apache.org/jira/browse/SPARK-47208
> Project: Spark
>  Issue Type: New Feature
>  Components: Kubernetes, Spark Core, YARN
>Affects Versions: 3.5.1
>Reporter: Joao Correia
>Priority: Major
>  Labels: pull-request-available





[jira] [Updated] (SPARK-47208) Allow overriding base overhead memory

2024-02-28 Thread Joao Correia (Jira)


[ https://issues.apache.org/jira/browse/SPARK-47208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joao Correia updated SPARK-47208:
-
Component/s: Spark Core

> Allow overriding base overhead memory
> -
>
> Key: SPARK-47208
> URL: https://issues.apache.org/jira/browse/SPARK-47208
> Project: Spark
>  Issue Type: New Feature
>  Components: Kubernetes, Spark Core, YARN
>Affects Versions: 3.5.1
>Reporter: Joao Correia
>Priority: Major
>  Labels: pull-request-available





[jira] [Created] (SPARK-47208) Allow overriding base overhead memory

2024-02-28 Thread Joao Correia (Jira)
Joao Correia created SPARK-47208:


 Summary: Allow overriding base overhead memory
 Key: SPARK-47208
 URL: https://issues.apache.org/jira/browse/SPARK-47208
 Project: Spark
  Issue Type: New Feature
  Components: Kubernetes, YARN
Affects Versions: 3.5.1
Reporter: Joao Correia


We can already set the desired overhead memory directly via the 
_'spark.driver/executor.memoryOverhead'_ flags. However, if that flag is not 
set, the overhead memory is calculated as follows:
{code:java}
overhead_memory = Max(384, 'spark.driver/executor.memory' * 'spark.driver/executor.memoryOverheadFactor')

where the 'memoryOverheadFactor' flag defaults to 0.1{code}
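
For concreteness, here is a small Scala sketch of that default rule (the heap sizes are just example values; this mirrors the formula above rather than the actual Spark source):
{code:scala}
// Rough sketch of the current default, NOT the actual Spark source:
// overhead = max(384, executorMemory * memoryOverheadFactor), all in MiB.
object CurrentOverhead {
  val MinOverheadMiB = 384L   // hard-coded floor today
  val OverheadFactor = 0.1    // spark.{driver,executor}.memoryOverheadFactor default

  def overheadMiB(heapMiB: Long): Long =
    math.max(MinOverheadMiB, (heapMiB * OverheadFactor).toLong)

  def main(args: Array[String]): Unit = {
    println(overheadMiB(2048)) // 384 -> a 2 GiB executor still gets the 384 MiB floor
    println(overheadMiB(8192)) // 819 -> an 8 GiB executor scales with the factor
  }
}
{code}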

There are cases where being able to override the 384 MiB minimum directly is 
beneficial. For example, a workload may perform a lot of off-heap operations 
(e.g. using package managers or native compression/decompression) and therefore 
not need a large JVM heap, while still needing a significant amount of memory 
on the Spark node.

Using the _'memoryOverheadFactor'_ flag may not be appropriate here, since we 
may not want the overhead allocation to scale directly with the JVM memory, for 
cost-saving or resource-limitation reasons.
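
As a rough illustration of that point (the numbers below are assumed for the example, not taken from a real workload):
{code:scala}
// Assumed numbers, for illustration only: covering a fixed off-heap need on a
// small executor purely through the factor over-reserves on larger executors.
object FactorScalingProblem {
  def main(args: Array[String]): Unit = {
    // To guarantee ~1 GiB of overhead on a 2 GiB executor via the factor alone,
    // the factor has to be raised to ~0.5 ...
    val neededFactor = 1024.0 / 2048.0                   // 0.5
    // ... which then reserves 8 GiB of overhead on a 16 GiB executor.
    val overheadOn16GiB = (16384 * neededFactor).toLong  // 8192 MiB
    println(s"factor=$neededFactor, overhead on 16 GiB executor=$overheadOn16GiB MiB")
  }
}
{code}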

As such, I propose adding a 'spark.driver/executor.minMemoryOverhead' flag that 
can be used to override the 384 MiB value used in the overhead calculation.

The memory overhead calculation would then become:
{code:java}
min_memory = sparkConf.get('spark.driver/executor.minMemoryOverhead').getOrElse(384)

overhead_memory = Max(min_memory, 'spark.driver/executor.memory' * 'spark.driver/executor.memoryOverheadFactor'){code}
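
A minimal Scala sketch of that proposed rule, assuming the new optional 'spark.driver/executor.minMemoryOverhead' setting described above (illustrative only, not the implementation in the linked PR):
{code:scala}
// Sketch of the proposed rule, assuming an optional minMemoryOverhead setting.
object ProposedOverhead {
  val DefaultMinOverheadMiB = 384L
  val DefaultOverheadFactor = 0.1

  def overheadMiB(heapMiB: Long,
                  minOverheadMiB: Option[Long],
                  factor: Double = DefaultOverheadFactor): Long = {
    val floor = minOverheadMiB.getOrElse(DefaultMinOverheadMiB)
    math.max(floor, (heapMiB * factor).toLong)
  }

  def main(args: Array[String]): Unit = {
    println(overheadMiB(2048, None))        // 384  -> today's behaviour
    println(overheadMiB(2048, Some(1024L))) // 1024 -> floor raised without touching the factor
  }
}
{code}
If the proposal lands under that name, usage would presumably be something like passing _--conf spark.executor.minMemoryOverhead=1024_ to spark-submit (in MiB, if it follows the existing _'memoryOverhead'_ convention), leaving _'memoryOverheadFactor'_ at its default.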

PR: https://github.com/apache/spark/pull/45240



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org