[
https://issues.apache.org/jira/browse/IMPALA-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Csaba Ringhofer updated IMPALA-14757:
-------------------------------------
Description:
set num_nodes=1;
with s as (select l_shipdate, l_orderkey, max(l_orderkey) over() maxkey from
tpch_parquet.lineitem) select * from s where maxkey = l_orderkey;
summary:
{code}
+--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
| Operator | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows |
Peak Mem | Est. Peak Mem | Detail |
+--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
| F00:ROOT | 1 | 1 | 22.67us | 22.67us | | |
4.02 MB | 21.75 MB | |
| 02:SELECT | 1 | 1 | 24.14ms | 24.14ms | 2 | 600.12K |
24.00 KB | 0 B | |
| 01:ANALYTIC | 1 | 1 | 723.74ms | 723.74ms | 6.00M | 6.00M |
178.09 MB | 4.00 MB | |
| 00:SCAN HDFS | 1 | 1 | 10.94ms | 10.94ms | 6.00M | 6.00M |
29.22 MB | 160.00 MB | tpch_parquet.lineitem |
+--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
{code}
The analytic node consumer 178MB vs the estimated 4MB.
Note that this query would be more efficient if it was rewritten to use scalar
subquery the get the max.
was:
set num_nodes=1;
with s as (select l_shipdate, l_orderkey, max(l_orderkey) over() maxkey from
tpch_parquet.lineitem) select * from s where maxkey = l_orderkey;
summary:
{code}
+--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
| Operator | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows |
Peak Mem | Est. Peak Mem | Detail |
+--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
| F00:ROOT | 1 | 1 | 22.67us | 22.67us | | |
4.02 MB | 21.75 MB | |
| 02:SELECT | 1 | 1 | 24.14ms | 24.14ms | 2 | 600.12K |
24.00 KB | 0 B | |
| 01:ANALYTIC | 1 | 1 | 723.74ms | 723.74ms | 6.00M | 6.00M |
178.09 MB | 4.00 MB | |
| 00:SCAN HDFS | 1 | 1 | 10.94ms | 10.94ms | 6.00M | 6.00M |
29.22 MB | 160.00 MB | tpch_parquet.lineitem |
+--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
{code}
The analytic node consumer 178MB vs the estimated 4MB.
> Analytic functions' mem usage can be underestimated
> ---------------------------------------------------
>
> Key: IMPALA-14757
> URL: https://issues.apache.org/jira/browse/IMPALA-14757
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Csaba Ringhofer
> Priority: Major
>
> set num_nodes=1;
> with s as (select l_shipdate, l_orderkey, max(l_orderkey) over() maxkey
> from tpch_parquet.lineitem) select * from s where maxkey = l_orderkey;
> summary:
> {code}
> +--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
> | Operator | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows |
> Peak Mem | Est. Peak Mem | Detail |
> +--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
> | F00:ROOT | 1 | 1 | 22.67us | 22.67us | | |
> 4.02 MB | 21.75 MB | |
> | 02:SELECT | 1 | 1 | 24.14ms | 24.14ms | 2 | 600.12K |
> 24.00 KB | 0 B | |
> | 01:ANALYTIC | 1 | 1 | 723.74ms | 723.74ms | 6.00M | 6.00M |
> 178.09 MB | 4.00 MB | |
> | 00:SCAN HDFS | 1 | 1 | 10.94ms | 10.94ms | 6.00M | 6.00M |
> 29.22 MB | 160.00 MB | tpch_parquet.lineitem |
> +--------------+--------+-------+----------+----------+-------+------------+-----------+---------------+-----------------------+
> {code}
> The analytic node consumer 178MB vs the estimated 4MB.
> Note that this query would be more efficient if it was rewritten to use
> scalar subquery the get the max.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]