wenwj0 opened a new issue, #10010:
URL: https://github.com/apache/incubator-gluten/issues/10010

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   When I run a SQL query like the one below, an OOM error occurs.
   ```sql
   select 20250530 ,key_id2, count(distinct key_id3)
   from 
   (
       select *
       from xxxtable1
       where dt between '20180101' and '20250530' 
   ) a
   left join 
   (
       select * 
       from xxxtable2
       where ds between '20180101' and '20250530'
   ) b on lower(a.key_id1)=lower(b.key_id1) and a.key_id2=b.key_id2
    group by key_id2
   ```
   The error message is:
   `ExecutorLostFailure (executor 43 exited caused by one of the running tasks) 
Reason: Container killed by YARN for exceeding physical memory limits. 6.0 GB 
of 6 GB physical memory used. Consider boosting spark.executor.memoryOverhead.`
   
   The scans of these two tables read about 1 TB of data. I tried both 
shuffled hash join and sort-merge join, and both failed. The same 
SQL runs successfully in vanilla Spark.
   
   The failed stage is the join stage. I suspect something is wrong with spilling.
   
   <img width="726" alt="Image" 
src="https://github.com/user-attachments/assets/4fd8c52a-b2a3-4355-bd8b-27e7e38a7e39"
 />
   
   
   ### Gluten version
   
   Gluten-1.3
   
   ### Spark version
   
   Spark-3.2.x
   
   ### Spark configurations
   
   spark.memory.offHeap.enabled=true;
   spark.memory.offHeap.size=3g;
   spark.yarn.executor.memoryOverhead=2g;
   spark.executor.memory=1g;
   spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager;
   spark.sql.shuffle.partitions=500;
   spark.io.compression.codec=zstd;
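
For reference, here is a rough sketch of how these settings add up to the container size in the error message. It assumes Spark 3.x on YARN, where the executor container request is the sum of executor memory, memory overhead, and off-heap size (PySpark memory is not set here). `parse_gib` is a hypothetical helper written only for this illustration, and YARN may additionally round the request up to its allocation increment.

```python
# Sketch: sum the executor memory settings from this report.
# Assumption: container size = executor memory + overhead + off-heap size
# (Spark 3.x on YARN, spark.executor.pyspark.memory unset).

def parse_gib(value: str) -> int:
    """Parse a size such as '3g' into whole GiB (hypothetical helper, 'g' suffix only)."""
    if not value.endswith("g"):
        raise ValueError(f"unsupported size: {value}")
    return int(value[:-1])

conf = {
    "spark.executor.memory": "1g",
    "spark.yarn.executor.memoryOverhead": "2g",
    "spark.memory.offHeap.size": "3g",
}

container_gib = sum(parse_gib(v) for v in conf.values())
print(container_gib)  # 6
```

The total matches the 6 GB physical memory limit reported by YARN, so any native allocation beyond the 3g off-heap and 2g overhead budgets would push the container over the limit.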
   
   ### System information
   
   Gluten Version: 1.3.0
   Commit: 98546a6d62e889d792d44715d90b1bf92f2e74e3
   CMake Version: 3.28.3
   System: Linux-4.9.0-14-amd64
   Arch: x86_64
   CPU Name: Model name:          Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
   C++ Compiler: /usr/bin/c++
   C++ Compiler Version: 11.5.0
   C Compiler: /usr/bin/cc
   C Compiler Version: 11.5.0
   CMake Prefix Path: 
/usr/local;/usr;/;/usr/local;/usr/local;/usr/X11R6;/usr/pkg;/opt
   
   ### Relevant logs
   
   ```bash
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

