Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20104 )
Change subject: IMPALA-11842: Improve memory estimation for Aggregate ...................................................................... Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/20104/3/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/20104/3/be/src/service/query-options.h@298 PS3, Line 298: QUERY_OPT_FN(agg_bytes_correlation_factor, AGG_BYTES_CORRELATION_FACTOR, \ > Maybe better to use mem instead of bytes in these option names Ack. Will change in next patch set. http://gerrit.cloudera.org:8080/#/c/20104/3/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: http://gerrit.cloudera.org:8080/#/c/20104/3/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@572 PS3, Line 572: if (useMaxNdv && result > maxNdv) result = maxNdv; > Should max be divided if partition is true? I think we should be careful not to severely underestimate the final number. maxNdv is not always comes from the partition column. Also, before this, result is already divided in line 570. If maxNdv is already less than that divided result, I think it should go with the maxNdv itself. -- To view, visit http://gerrit.cloudera.org:8080/20104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4b4b2e519ee89f0a13fdb62d0471ee4047f6421 Gerrit-Change-Number: 20104 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Comment-Date: Thu, 29 Jun 2023 16:52:56 +0000 Gerrit-HasComments: Yes