This is an automated email from the ASF dual-hosted git repository.

chengchengjin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git


The following commit(s) were added to refs/heads/main by this push:
     new ca2ab6ad7d [DOC] Add doc about experimental feature using off-heap to 
store broadcast build relation (#8882)
ca2ab6ad7d is described below

commit ca2ab6ad7d9c461b7ca1eb7b032f460ce4d567ca
Author: Terry Wang <[email protected]>
AuthorDate: Tue Mar 4 17:11:35 2025 +0800

    [DOC] Add doc about experimental feature using off-heap to store broadcast 
build relation (#8882)
---
 docs/Configuration.md     |  1 +
 docs/get-started/Velox.md | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/docs/Configuration.md b/docs/Configuration.md
index 86ee06bfb8..6c452ef86b 100644
--- a/docs/Configuration.md
+++ b/docs/Configuration.md
@@ -98,6 +98,7 @@ The following configurations are related to Velox settings.
 | spark.gluten.velox.fs.s3a.connect.timeout                            | Timeout for AWS s3 connection.                                                  | 1s                |
 | spark.gluten.sql.columnar.backend.velox.orc.scan.enabled             | Enable velox orc scan. If disabled, vanilla spark orc scan will be used.        | true              |
 | spark.gluten.sql.complexType.scan.fallback.enabled                   | Force fallback for complex type scan, including struct, map, array.             | true              |
+| spark.gluten.velox.offHeapBroadcastBuildRelation.enabled             | Experimental: If enabled, broadcast build relations use off-heap memory; otherwise they use on-heap memory. | false             |
 
 Additionally, you can control the configurations of gluten at thread level by 
local property.
 
diff --git a/docs/get-started/Velox.md b/docs/get-started/Velox.md
index d7e93e3f92..00cb431e17 100644
--- a/docs/get-started/Velox.md
+++ b/docs/get-started/Velox.md
@@ -545,6 +545,30 @@ I20231121 10:19:42.348845 90094332 
WholeStageResultIterator.cc:220] Native Plan
       queuedWallNanos              sum: 2.00us, count: 1, min: 2.00us, max: 
2.00us
 ```
 
+
+## Broadcast Build Relations to Off-Heap (Experimental)
+
+The experimental feature **Off-Heap Broadcast Build Relations** aims to mitigate out-of-memory (OOM) issues caused by heap memory consumption during broadcast operations.
+A detailed design document is available [here](https://docs.google.com/document/d/1eZNWPUEdiz2JPJfhyVn9hrk6SqJFRNzOMZm6u5Yredk/edit?tab=t.0).
+
+### Purpose & how it works
+- **Avoid OOM**: Prevent OOM errors when broadcasting large datasets.
+- **Reduce Heap Memory Usage**: Store broadcast build relations in Spark off-heap memory instead of on-heap memory.
+
+### Configuration
+
+#### Enable Off-Heap Broadcast
+To enable this feature, you can set the following Spark configuration:
+
+| Property                                                    | Default | Description                                                       |
+|-------------------------------------------------------------|---------|-------------------------------------------------------------------|
+| `spark.gluten.velox.offHeapBroadcastBuildRelation.enabled`  | `false` | Enable/disable off-heap storage for broadcast build relations.    |
+
+This feature has passed a series of tests, and we are still collecting feedback from users. If you encounter memory problems with broadcast build relations, please try this feature and share your feedback.
+
+**Note**: This feature will become the default behavior once stabilized. Stay 
tuned for updates!
+
+
 # Accelerators
 
 Please refer [HBM](VeloxHBM.md) [QAT](VeloxQAT.md) [IAA](VeloxIAA.md) for 
details
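
As a rough illustration of how the property documented in this commit might be passed to a job, the sketch below assembles a `spark-submit` command line in Python. Only `spark.gluten.velox.offHeapBroadcastBuildRelation.enabled` comes from the commit. The helper `build_conf_flags`, the script name `my_job.py`, and the `4g` off-heap size are illustrative assumptions. The two `spark.memory.offHeap.*` entries are standard Spark settings, included on the assumption that off-heap memory has to be enabled and sized for the feature to be useful.

```python
import shlex

# Property name taken from the documentation added in this commit.
OFF_HEAP_BROADCAST_KEY = "spark.gluten.velox.offHeapBroadcastBuildRelation.enabled"


def build_conf_flags(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments.

    Hypothetical helper for illustration; it is not part of Gluten or Spark.
    """
    flags = []
    for key, value in conf.items():
        flags.extend(["--conf", f"{key}={value}"])
    return flags


conf = {
    # Standard Spark off-heap settings; the 4g size is an illustrative assumption.
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "4g",
    # Experimental flag documented in this commit (default: false).
    OFF_HEAP_BROADCAST_KEY: "true",
}

# "my_job.py" is a hypothetical application script.
command = ["spark-submit", *build_conf_flags(conf), "my_job.py"]
print(shlex.join(command))
```

Keeping the properties in a dictionary makes it easy to flip the experimental flag back to `false` when comparing memory behavior with and without the feature.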


