Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/22425 )
Change subject: IMPALA-12959: Calcite planner: Implement count star optimization... ...................................................................... Patch Set 10: Code-Review+1 (5 comments) This looks good to me, just a couple nits http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java: http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@148 PS10, Line 148: // TODO: Should this be a map instead of a list? See IMPALA-12961 for details. Nit: Does this comment still apply? http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@161 PS10, Line 161: The : // modular arithmetic provides the correct position number for Impala. Nit: This comment about modular arithmetic is out of date. http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@183 PS10, Line 183: * XXX: Reviewer note: This code was moved from CalciteTable, so it's not new : * for this review. Before this code gets committed, this comment should be : * removed. Nit: Reminder to remove this later http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@253 PS10, Line 253: if (context.inputRefs_ == null) { : return getRowType().getFieldNames(); : } Somewhat unrelated: In the ImpalaFilterRel, we have some logic that avoids reading all of the table columns when it feeds into count(*). Could we have similar logic here? I'm not sure it matters for the count star optimization, because we don't actually read the data. It might matter for text tables. e.g. "select count(*) from functional.alltypes" Right now, the Calcite version of this has an 89 byte tuple. The regular planner has a 0 byte tuple. http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaPlanRel.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaPlanRel.java: http://gerrit.cloudera.org:8080/#/c/22425/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaPlanRel.java@101 PS10, Line 101: */ : public static boolean canPassThroughParentAggregate(ImpalaPlanRel planRel) { : switch (getRelNodeType((RelNode) planRel)) { : case FILTER: : case PROJECT: : return true; : default: : return false; : } We'll have to think about unions. It obviously can't do things like the count star optimization, but some of the information is useful and can pass through. (But that doesn't impact this review.) For example: select count(*) from (select * from functional.alltypes union all select * from functional.alltypestiny); The regular planner doesn't change this into a partition key scan, but it could: select distinct year from (select year from functional.alltypes union all select year from functional.alltypessmall); -- To view, visit http://gerrit.cloudera.org:8080/22425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I975beefedd2cceb34dad0f93343a46d1b7094c13 Gerrit-Change-Number: 22425 Gerrit-PatchSet: 10 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Steve Carlin <[email protected]> Gerrit-Comment-Date: Fri, 25 Apr 2025 23:18:53 +0000 Gerrit-HasComments: Yes
