[I] Dynamically sizing off-heap memory [incubator-gluten]

via GitHub Wed, 17 Apr 2024 06:35:48 -0700


supermem613 opened a new issue, #5438:
URL: https://github.com/apache/incubator-gluten/issues/5438


   ### Description
   
   When using Gluten with Velox and Spark, we currently specify the off-heap 
memory size and adjust the on-heap memory size to match. In practice, memory 
set aside for on-heap use cannot be used off-heap, and vice versa. This can 
leave the machine's memory underused, since a workload typically relies mostly 
on on-heap or mostly on off-heap memory, but rarely on large amounts of both 
at the same time.
   
   This is particularly painful, for example, when we fall back execution to 
"vanilla" Spark.
   
   For example, on a 64GB machine where we want to give Spark 56GB of memory, 
we might set on-heap memory (via the spark.executor.memory setting) to 14GB 
and off-heap memory (via the spark.memory.offHeap.size setting) to 42GB. In 
this case, if execution falls back to Spark, we are constrained by the 14GB of 
on-heap memory. If we don't fall back, we use at most 42GB. Either way, a fair 
number of GBs that could be used are left idle.
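
   The arithmetic behind this example can be sketched as follows. This is an 
illustration only: the class and method names are hypothetical, and it assumes 
the worst case in which the other pool sits entirely idle.

```java
// Illustrative arithmetic only; StaticSplitExample and idleGb are hypothetical names.
final class StaticSplitExample {
    /** Memory (in GB) left idle when only one pool of the static split is in use. */
    static long idleGb(long budgetGb, long activePoolGb) {
        return budgetGb - activePoolGb;
    }

    public static void main(String[] args) {
        long onHeapGb = 14;                    // spark.executor.memory
        long offHeapGb = 42;                   // spark.memory.offHeap.size
        long budgetGb = onHeapGb + offHeapGb;  // 56GB given to Spark in total

        // Fallback to vanilla Spark: only the on-heap pool does useful work.
        System.out.println("Idle during fallback: " + idleGb(budgetGb, onHeapGb) + "GB");
        // Native execution: only the off-heap pool does useful work.
        System.out.println("Idle during native execution: " + idleGb(budgetGb, offHeapGb) + "GB");
    }
}
```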
   
   We propose to combine the existing off-heap allocation tracking in Gluten 
with JDK APIs that report on-heap utilization (Runtime.getRuntime().totalMemory() 
and freeMemory()) to provide unified control of memory utilization. Note, 
however, that this approach does not actively control Java allocations. In 
practice it can allow some oversubscription of memory until a native 
allocation comes along and is failed accordingly.
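
   A minimal sketch of what such a unified check might look like is shown 
below. It is not Gluten's implementation: UnifiedMemoryGate, tryReserve, and 
the other names are hypothetical. Only the Runtime calls are real JDK APIs. 
The sketch assumes a single shared byte budget and rejects a native 
reservation when on-heap usage plus tracked off-heap usage plus the request 
would exceed it.

```java
// Hypothetical sketch; none of these names exist in Gluten.
final class UnifiedMemoryGate {
    private final long budgetBytes;  // one budget shared by on-heap and off-heap
    private long offHeapUsedBytes;   // tracked native (off-heap) reservations

    UnifiedMemoryGate(long budgetBytes) {
        this.budgetBytes = budgetBytes;
    }

    /** On-heap bytes currently in use, as reported by the JVM. */
    static long onHeapUsedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Pure decision: does the request fit in what is left of the budget? */
    static boolean fits(long budget, long onHeapUsed, long offHeapUsed, long request) {
        return onHeapUsed + offHeapUsed + request <= budget;
    }

    /** Reserve off-heap bytes, or return false if the shared budget would be exceeded. */
    synchronized boolean tryReserve(long bytes) {
        if (!fits(budgetBytes, onHeapUsedBytes(), offHeapUsedBytes, bytes)) {
            return false; // the native allocation is failed
        }
        offHeapUsedBytes += bytes;
        return true;
    }

    /** Return previously reserved off-heap bytes to the budget. */
    synchronized void release(long bytes) {
        offHeapUsedBytes -= bytes;
    }

    synchronized long offHeapUsedBytes() {
        return offHeapUsedBytes;
    }
}
```

   Because the check runs only when native memory is reserved, Java 
allocations made between checks are not stopped, which is the oversubscription 
window described above.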
   
   From a configuration perspective, a new Gluten Boolean configuration will 
turn this feature on, which in turn obviates any off-heap configuration: the 
settings for enabling and sizing off-heap memory will no longer be used. 
Instead, we will continue to configure the executor memory (the on-heap size) 
to use as much memory as possible, as is done today with "vanilla" Spark.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

