[
https://issues.apache.org/jira/browse/DRILL-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810091#comment-17810091
]
ASF GitHub Bot commented on DRILL-8478:
---------------------------------------
paul-rogers commented on code in PR #2875:
URL: https://github.com/apache/drill/pull/2875#discussion_r1463921977
##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/AbstractHashBinaryRecordBatch.java:
##########
@@ -1312,7 +1312,9 @@ private void cleanup() {
}
// clean (and deallocate) each partition, and delete its spill file
for (HashPartition partn : partitions) {
- partn.close();
+ if (partn != null) {
+ partn.close();
+ }
Review Comment:
The above is OK as a work-around. I wonder, however, where the code added a
null pointer to the partition list. That should never happen. If it does, it
should be fixed at the point where the null pointer is added to the list.
Fixing it here is incomplete: there are other places where we loop through the
list, and those will also fail.
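One way to read the suggestion above is to make partition setup all-or-nothing, so that no loop over the collection ever sees a null entry. The following is a minimal sketch under stated assumptions, not the Drill code: `DemoPartition` and `PartitionSetup` are hypothetical stand-ins, and it assumes (without confirmation from the PR) that the null entries come from a constructor that throws partway through filling the partition list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HashPartition: construction may fail partway through setup.
final class DemoPartition implements AutoCloseable {
  // Number of partitions currently holding (simulated) resources.
  static int liveCount = 0;

  DemoPartition(int index, int failAt) {
    if (index == failAt) {
      // Simulates an allocation failure inside the constructor.
      throw new IllegalStateException("simulated allocation failure at partition " + index);
    }
    liveCount++;
  }

  @Override
  public void close() {
    liveCount--;
  }
}

// Hypothetical helper: builds every partition or none of them.
final class PartitionSetup {
  static List<DemoPartition> build(int numPartitions, int failAt) {
    List<DemoPartition> built = new ArrayList<>(numPartitions);
    try {
      for (int i = 0; i < numPartitions; i++) {
        // Only fully constructed partitions are ever added, so the list never holds null.
        built.add(new DemoPartition(i, failAt));
      }
      return built;
    } catch (RuntimeException e) {
      // Release the partitions that were created before the failure, then rethrow.
      for (DemoPartition p : built) {
        p.close();
      }
      throw e;
    }
  }
}
```

With this shape, the cleanup loop and every other loop over the partitions can stay free of null checks, because a failed setup never publishes a partially filled collection.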
##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java:
##########
@@ -157,11 +162,11 @@ public HashPartition(FragmentContext context,
BufferAllocator allocator, Chained
.build(logger);
} catch (SchemaChangeException sce) {
throw new IllegalStateException("Unexpected Schema Change while creating a hash table",sce);
- }
- this.hjHelper = semiJoin ? null : new HashJoinHelper(context, allocator);
- tmpBatchesList = new ArrayList<>();
- if (numPartitions > 1) {
- allocateNewCurrentBatchAndHV();
+ } catch (OutOfMemoryException oom) {
+ close();
Review Comment:
This call is _probably_ fine. However, the design is that if any operator
fails, the entire operator stack is closed. So, `close()` should be called by
the fragment executor. There is probably no harm in calling `close()` here, as
long as the `close()` method is safe to call twice.
If the fragment executor _does not_ call close when the failure occurs
during setup, then there is a bug since failing to call `close()` results in
just this kind of error.
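The "safe to call twice" condition above can be illustrated with a guarded `close()`. This is a minimal sketch, not `HashPartition` itself: `DemoBuffers`, its fields, and `releaseCount` are hypothetical names introduced only for illustration.

```java
// Hypothetical resource holder whose close() may be called more than once.
final class DemoBuffers implements AutoCloseable {
  private long allocatedBytes;
  private boolean closed;
  // Counts how many times the resources were actually released.
  int releaseCount;

  DemoBuffers(long bytes) {
    this.allocatedBytes = bytes;
  }

  long allocatedBytes() {
    return allocatedBytes;
  }

  @Override
  public void close() {
    if (closed) {
      // Second and later calls are no-ops, so a catch block and an outer
      // owner can both call close() without releasing anything twice.
      return;
    }
    closed = true;
    allocatedBytes = 0;
    releaseCount++;
  }
}
```

If `close()` is guarded this way, calling it from the `catch` block and again from whatever closes the operator stack releases the buffers exactly once.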
> HashPartition memory leak when exception
> -----------------------------------------
>
> Key: DRILL-8478
> URL: https://issues.apache.org/jira/browse/DRILL-8478
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.21.1
> Reporter: shihuafeng
> Priority: Major
> Fix For: 1.21.2
>
> Attachments:
> 0001-DRILL-8478.-HashPartition-memory-leak-when-it-alloca.patch
>
>
> *Describe the bug*
> HashPartition leaks memory when buffer allocation fails with an OutOfMemoryException.
> *To Reproduce*
> Steps to reproduce the behavior:
> # Prepare the data for tpch 1s.
> # Run tpch sql8 with 20 concurrent queries.
> # Set direct memory to 5G.
> # When an OutOfMemoryException occurs, stop all queries.
> # Check for the memory leak.
> *Expected behavior*
> (1) I set \{DRILL_MAX_DIRECT_MEMORY:-"5G"}.
> (2) I ran sql8 (SQL detail under Additional context) with 20 concurrent queries.
> (3) An OutOfMemoryException occurred while creating the HashPartition.
> *Error detail, log output or screenshots*
> Unable to allocate buffer of size 262144 (rounded from 262140) due to memory
> limit (41943040). Current allocation: 20447232
>
> sql
> {code:java}
> select o_year,
>        sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as mkt_share
> from (
>   select extract(year from o_orderdate) as o_year,
>          l_extendedprice * 1.0 as volume,
>          n2.n_name as nation
>   from hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem,
>        hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1,
>        hive.tpch1s.nation n2, hive.tpch1s.region
>   where p_partkey = l_partkey
>     and s_suppkey = l_suppkey
>     and l_orderkey = o_orderkey
>     and o_custkey = c_custkey
>     and c_nationkey = n1.n_nationkey
>     and n1.n_regionkey = r_regionkey
>     and r_name = 'ASIA'
>     and s_nationkey = n2.n_nationkey
>     and o_orderdate between date '1995-01-01' and date '1996-12-31'
>     and p_type = 'LARGE BRUSHED BRASS'
> ) as all_nations
> group by o_year
> order by o_year
> {code}
>