Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15863 )


Change subject: IMPALA-9712: fix mem consumption of operators above selective scan
......................................................................

IMPALA-9712: fix mem consumption of operators above selective scan

This change is motivated by excessive memory consumption in
TPC-H Q19, which has a hash join and a non-grouping aggregate
above a selective scan.

There are two related parts to this change.

First, we fix RowBatch::AtCapacity() to account for the actual
memory consumed by the RowBatch. It previously used
total_allocated_bytes(), which does *not* account for unused space
in the MemPool chunks. It now uses total_reserved_bytes(), which
counts the whole chunks. This reduced memory consumption of the
agg from 60+MB to ~16MB.
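
The difference between the two counters can be illustrated with a
minimal sketch. ToyMemPool, kAtCapacityLimit, AtCapacityOld() and
AtCapacityNew() are hypothetical names for illustration only; this
is not Impala's MemPool or RowBatch code, and the chunking policy
and limit are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a chunked memory pool (NOT Impala's MemPool).
class ToyMemPool {
 public:
  explicit ToyMemPool(int64_t chunk_size) : chunk_size_(chunk_size) {}

  // Hands out 'size' bytes. Starts a new chunk when the current chunk
  // cannot fit the request; leftover space in the old chunk stays unused.
  void Allocate(int64_t size) {
    assert(size > 0);
    if (free_in_chunk_.empty() || free_in_chunk_.back() < size) {
      int64_t new_chunk = size > chunk_size_ ? size : chunk_size_;
      free_in_chunk_.push_back(new_chunk);
      total_reserved_bytes_ += new_chunk;
    }
    free_in_chunk_.back() -= size;
    total_allocated_bytes_ += size;
  }

  // Bytes handed out to callers; excludes unused space in chunks.
  int64_t total_allocated_bytes() const { return total_allocated_bytes_; }
  // Bytes held in chunks; includes unused space.
  int64_t total_reserved_bytes() const { return total_reserved_bytes_; }

 private:
  int64_t chunk_size_;
  std::vector<int64_t> free_in_chunk_;  // Remaining free bytes per chunk.
  int64_t total_allocated_bytes_ = 0;
  int64_t total_reserved_bytes_ = 0;
};

// Hypothetical memory limit for a batch, chosen for this sketch.
constexpr int64_t kAtCapacityLimit = 8 * 1024 * 1024;

// Check based on bytes handed out: can undercount what the batch holds.
bool AtCapacityOld(const ToyMemPool& pool) {
  return pool.total_allocated_bytes() >= kAtCapacityLimit;
}

// Check based on whole chunks: reflects memory actually held.
bool AtCapacityNew(const ToyMemPool& pool) {
  return pool.total_reserved_bytes() >= kAtCapacityLimit;
}
```

With 1MB chunks and ten 600KB allocations, every allocation opens a
new chunk, so the pool holds 10MB while reporting under 6MB as
allocated. Only the reserved-bytes check reports the batch as full.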

Second, we make PartitionedHashJoinNode flush memory a bit more
aggressively by exiting loops when a small amount of memory
is accumulated in an empty batch. This reduced memory consumption
of the agg further from ~16MB to ~8MB.
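
The effect of exiting early can be sketched with a toy simulation.
ToyOutputBatch, ShouldYield(), PeakBytesHeld() and the
empty_flush_bytes threshold are hypothetical and are not taken from
partitioned-hash-join-node.cc; the sketch assumes each filtered-out
input leaves some memory attached to the output batch.

```cpp
#include <algorithm>
#include <cstdint>

// Toy output batch: row count plus memory attached to it (hypothetical).
struct ToyOutputBatch {
  int num_rows = 0;
  int64_t attached_bytes = 0;
};

// Returns true if the producing loop should stop and hand the batch to
// its caller: either the batch is full of rows, or it has no rows but
// has accumulated at least 'empty_flush_bytes' of memory.
bool ShouldYield(const ToyOutputBatch& batch, int capacity,
                 int64_t empty_flush_bytes) {
  if (batch.num_rows >= capacity) return true;
  return batch.num_rows == 0 && batch.attached_bytes >= empty_flush_bytes;
}

// Simulates a fully selective probe: every input produces zero output
// rows but attaches 'bytes_per_input' of memory to the output batch.
// Returns the peak number of bytes held by a single output batch.
int64_t PeakBytesHeld(int num_inputs, int64_t bytes_per_input,
                      int64_t empty_flush_bytes) {
  int64_t peak = 0;
  ToyOutputBatch batch;
  for (int i = 0; i < num_inputs; ++i) {
    batch.attached_bytes += bytes_per_input;
    peak = std::max(peak, batch.attached_bytes);
    if (ShouldYield(batch, /*capacity=*/1024, empty_flush_bytes)) {
      // The caller consumes the batch and releases its memory.
      batch = ToyOutputBatch();
    }
  }
  return peak;
}
```

In this toy model, a very large threshold (no early exit) lets the
batch hold memory from every input, while a small threshold caps the
peak at roughly the threshold.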

Testing:
Ran TPC-H Q19 on parquet with mt_dop=8. Aggregation mem usage was
reduced from 60+MB to ~8MB.

Performance:
No significant change on a single-node TPC-H run.

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 6.15    | -0.39%     | 4.52       | -0.45%         |
+----------+-----------------------+---------+------------+------------+----------------+

+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval  |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| TPCH(30) | TPCH-Q2  | parquet / none / none | 2.82   | 2.80        |   +0.79%   |   2.36%    |   2.50%        | 40    |   +1.59%       | 1.33    | 1.45  |
| TPCH(30) | TPCH-Q8  | parquet / none / none | 5.29   | 5.26        |   +0.49%   |   1.72%    |   1.73%        | 40    |   +0.78%       | 1.50    | 1.26  |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 13.78  | 13.76       |   +0.18%   |   1.51%    |   1.64%        | 40    |   +0.32%       | 0.60    | 0.51  |
| TPCH(30) | TPCH-Q16 | parquet / none / none | 1.80   | 1.80        |   +0.31%   |   2.95%    |   2.24%        | 40    |   +0.09%       | 1.27    | 0.53  |
| TPCH(30) | TPCH-Q21 | parquet / none / none | 22.26  | 22.24       |   +0.07%   |   1.86%    |   1.83%        | 40    |   +0.17%       | 0.56    | 0.16  |
| TPCH(30) | TPCH-Q11 | parquet / none / none | 1.11   | 1.11        |   +0.13%   |   5.75%    |   3.68%        | 40    |   -0.13%       | -0.71   | 0.12  |
| TPCH(30) | TPCH-Q7  | parquet / none / none | 4.47   | 4.48        |   -0.15%   |   1.37%    |   1.86%        | 40    |   +0.01%       | 0.10    | -0.40 |
| TPCH(30) | TPCH-Q19 | parquet / none / none | 4.04   | 4.05        |   -0.22%   |   1.99%    |   2.13%        | 40    |   -0.03%       | -0.55   | -0.48 |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 1.98   | 1.98        |   -0.25%   |   2.58%    |   3.10%        | 40    |   -0.04%       | -0.52   | -0.39 |
| TPCH(30) | TPCH-Q12 | parquet / none / none | 3.17   | 3.19        |   -0.42%   |   2.71%    |   1.73%        | 40    |   -0.11%       | -0.84   | -0.82 |
| TPCH(30) | TPCH-Q3  | parquet / none / none | 3.96   | 3.98        |   -0.47%   |   1.85%    |   1.52%        | 40    |   -0.17%       | -1.21   | -1.25 |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 5.25   | 5.29        |   -0.81%   |   2.11%    |   6.02%        | 40    |   +0.08%       | 0.54    | -0.80 |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.63   | 1.64        |   -0.69%   |   2.81%    |   2.72%        | 40    |   -0.07%       | -0.75   | -1.13 |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 9.79   | 9.87        |   -0.79%   |   1.17%    |   0.94%        | 40    |   -0.61%       | -2.92   | -3.33 |
| TPCH(30) | TPCH-Q10 | parquet / none / none | 7.89   | 7.91        |   -0.24%   | * 13.08% * | * 11.07% *     | 40    |   -1.16%       | -1.34   | -0.09 |
| TPCH(30) | TPCH-Q18 | parquet / none / none | 14.07  | 13.79       |   +2.04%   | * 29.12% * | * 19.15% *     | 40    |   -3.46%       | -3.14   | 0.36  |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 3.77   | 3.79        |   -0.66%   |   1.56%    |   1.48%        | 40    |   -0.82%       | -2.19   | -1.96 |
| TPCH(30) | TPCH-Q14 | parquet / none / none | 3.62   | 3.63        |   -0.27%   |   4.40%    |   2.64%        | 40    |   -1.23%       | -1.01   | -0.34 |
| TPCH(30) | TPCH-Q5  | parquet / none / none | 4.53   | 4.56        |   -0.81%   |   1.88%    |   1.33%        | 40    |   -1.06%       | -2.03   | -2.24 |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 2.94   | 2.96        |   -0.87%   |   2.15%    |   2.04%        | 40    |   -1.52%       | -1.85   | -1.87 |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.66   | 2.70        |   -1.63%   |   1.95%    |   2.37%        | 40    |   -1.79%       | -2.79   | -3.37 |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 14.58  | 15.14       |   -3.72%   |   3.08%    |   2.98%        | 40    |   -3.44%       | -4.35   | -5.60 |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+

Change-Id: I6debae562826621411bbcbb757978e227b395441
---
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/row-batch.h
2 files changed, 14 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/15863/1
--
To view, visit http://gerrit.cloudera.org:8080/15863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6debae562826621411bbcbb757978e227b395441
Gerrit-Change-Number: 15863
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
