weijunlu opened a new issue, #2871: URL: https://github.com/apache/drill/issues/2871
Drill version: <version>1.22.0-SNAPSHOT</version> **Describe the bug** the mergejoin memery leak. **To Reproduce** 1. Configure direct memory, In the drill-env.sh file in the conf directory add: export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"1G"} 2. drill hive storage plugin configuration: { "type": "hive", "configProps": { "hive.metastore.sasl.enabled": "false", "fs.default.name": "hdfs://xxx:9000", "datanucleus.schema.autoCreateAll": "true", "javax.jdo.option.ConnectionURL": "jdbc:mysql://xxx:3306/tpch?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=utf8&useSSL=false", "javax.jdo.option.ConnectionUserName": "root", "javax.jdo.option.ConnectionPassword": "******", "hive.metastore.uris": "thrift://xxx:9083", "hive.metastore.schema.verification": "false", "hive.metastore.warehouse.dir": "/user/hive/warehouse", "hive.jdbc.driver": "org.apache.hive.jdbc.HiveDriver", "hive.jdbc.url": "jdbc:hive2://xxx:10000", "hive.jdbc.user": "root", "hive.jdbc.password": "******", "hive.dml.mode": "jdbc", "writable": "true" }, "enabled": true, "authMode": "SHARED_USER" } 3. Create eight tpch tables, use tpch tool to generate 1s data, and import data into hive. See the attachment for details. 4. Write the execution script, run the tpche sql statements in 5 concurrency: shell script: [root@jupiterdev memeryleak]# cat para_power_test.sh #!/bin/bash drill_home=/opt/apache-drill-1.22.0-SNAPSHOT/bin random_sql(){ while true do $drill_home/sqlline -u \"jdbc:drill:zk=localhost:2181\" -f tpch_sql8.sql >> log/tpch1s_sql8.log 2>&1 done } main(){ unset HADOOP_CLASSPATH #sleep 2h #TPCH power test for i in `seq 1 5` do random_sql & done } main $@ the sql as follows: select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from hive.tpch.part, hive.tpch.supplier, hive.tpch.lineitem, hive.tpch.orders, hive.tpch.customer, hive.tpch.nation n1, hive.tpch.nation n2, hive.tpch.region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as all_nations group by o_year order by o_year; 5. Execute the script and view the sqlline running status: [root@jupiterdev memeryleak]# ./para_power_test.sh [root@jupiterdev memeryleak]# jps | grep SqlLine 5638 SqlLine 5799 SqlLine 5705 SqlLine 5709 SqlLine 5584 SqlLine 6. View drill's direct memory through the 8047 page: <img width="959" alt="1" src="https://github.com/apache/drill/assets/104422448/e6331b33-c6c9-4f4f-8c6a-7334b46f6b69"> <img width="959" alt="2" src="https://github.com/apache/drill/assets/104422448/162500be-6a59-4a18-8b73-79a6bac8279d"> 7. After running for a while, an sql statement error is detected, and then pkill -f para_power_test.sh 8. Check again that there is no sqlline execution, bug the direct memery is not fully release. <img width="959" alt="4" src="https://github.com/apache/drill/assets/104422448/9f9724f8-6210-48c4-a1ca-132215d2602f"> **Expected behavior** Expected that the direct memery can fully release. **Additional context** update the planner.enable_hashjoin option to false through the 8047 page <img width="562" alt="1705471433736" src="https://github.com/apache/drill/assets/104422448/a12ed78b-be76-44a7-b75a-08828440983e"> [hive.txt](https://github.com/apache/drill/files/13959322/hive.txt) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org