Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24017 )

Change subject: IMPALA-14484: Calcite planner: Add column stats to union slot 
descriptors
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/24017/1/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q38.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q38.test:

http://gerrit.cloudera.org:8080/#/c/24017/1/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q38.test@46
PS1, Line 46: |  mem-estimate=50.00GB mem-reservation=34.00MB 
spill-buffer=2.00MB thread-reservation=0
> This is a substantial change. The row-size only grew 33%.
Yeah, heh, that is a big change :)

But the new number is more in line with what it should be.  A couple of 
observations:
1) The cardinality of this aggregate is 13.15G which hasn't changed, so 128MB 
didn't really make sense.
2) This is more in line with 07:AGGREGATE, which is also 50GB.
3) I stepped through the debugger and found the major difference is caused here:

https://github.com/scarlin-cloudera/impala/blob/master/fe/src/main/java/org/apache/impala/planner/PlanFragment.java#L428

Before this change, the ndv for each expression was showing up as "-1".  Those 
unknown ndvs compounded into an erroneous estimate.

After this change, it has 3 fields in the group by with the following ndvs 
which are multiplied together:
last name: 4963
first name: 5042
date: 67399

Multiplying these together gives 1,686,555,236,954, which puts the resulting 
estimate way over 50.00GB.  The 50GB comes from a later clamp (Math.min(), not 
Math.max()) which ensures that the memory estimate will not be over 50.00GB.
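
For anyone who wants to check the arithmetic, here is a rough sketch in Python. 
It is not the Impala code: the function names and the CAP_BYTES constant are 
made up for illustration. It only multiplies the three ndvs above and shows 
that an upper bound is applied with min() rather than max().

```python
from math import prod

# Hypothetical constant for illustration: 50GB expressed in bytes.
CAP_BYTES = 50 * 1024**3


def ndv_product(ndvs):
    """Multiply the per-expression ndvs of the group-by fields together."""
    return prod(ndvs)


def clamp_upper(estimate, cap):
    """Apply an upper bound: the result is never larger than cap."""
    return min(estimate, cap)


# ndvs quoted in this comment: last name, first name, date.
group_by_ndvs = [4963, 5042, 67399]
product = ndv_product(group_by_ndvs)
print(f"{product:,}")

# Stand-in value only: any raw estimate above the cap is pulled down to it.
print(clamp_upper(product, CAP_BYTES) == CAP_BYTES)

# max() would do the opposite and keep the larger, uncapped value.
print(max(product, CAP_BYTES) == product)
```

The product is a count of distinct groups, not a byte size, so it is only used 
here as a stand-in for "some value far above the cap".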



--
To view, visit http://gerrit.cloudera.org:8080/24017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0585fafa45658dffffdbf6410b028f03304b6e9
Gerrit-Change-Number: 24017
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Steve Carlin <[email protected]>
Gerrit-Comment-Date: Wed, 04 Mar 2026 01:33:12 +0000
Gerrit-HasComments: Yes
