Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/21109 )
Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple queries
......................................................................


Patch Set 16:

(7 comments)

A few more comments based on a second pass.

http://gerrit.cloudera.org:8080/#/c/21109/16/bin/set-classpath.sh
File bin/set-classpath.sh:

http://gerrit.cloudera.org:8080/#/c/21109/16/bin/set-classpath.sh@62
PS16, Line 62: FE
nit: use expanded form, not acronym


http://gerrit.cloudera.org:8080/#/c/21109/16/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/21109/16/bin/start-impala-cluster.py@182
PS16, Line 182: U
nit: lowercase 'u'.


http://gerrit.cloudera.org:8080/#/c/21109/16/bin/start-impala-cluster.py@183
PS16, Line 183: "instead of JniFrontend.")
JniFrontend is an internal class, shouldn't be mentioned in a user message. How about 'If true, use the Calcite planner for query optimization instead of Impala planner'


http://gerrit.cloudera.org:8080/#/c/21109/16/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
File fe/src/main/java/org/apache/impala/planner/PlannerContext.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/fe/src/main/java/org/apache/impala/planner/PlannerContext.java@97
PS16, Line 97: // Constructor useful for an external planner module
nit: this comment can be removed since the purpose of this patch is to make Calcite planner an internal module.

http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@95
PS16, Line 95: List<Expr> scanOutputExprs = new ArrayList<>(Collections.nCopies(totalCols, null));
For wide tables where only a few columns are projected, we will end up with a long list of mostly nulls. A LinkedHashMap (which preserves insertion order), where the key is the position and the value is the SlotRef, would be better suited despite the CPU cost of hashing. In general, memory is the most precious commodity in a query planner since the plan search space can be large, so anything we can do to reduce the memory footprint is preferred. That said, I would be ok if this is done in a subsequent patch; just keep track of it through a Jira.


http://gerrit.cloudera.org:8080/#/c/21109/15/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/15/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@52
PS15, Line 52: an experimental
nit: remove experimental


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@148
PS16, Line 148: if (e == null) {
Why is this checking for null here? e is already being referenced above.

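
To illustrate the ImpalaHdfsScanRel.java@95 comment above, here is a rough sketch of the two layouts side by side. It is not taken from the patch: the class and method names are hypothetical, and String stands in for Impala's Expr/SlotRef so that the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; String is a stand-in for Expr/SlotRef.
public class SparseScanOutputSketch {

  // Dense layout (the PS16 approach): one entry per table column,
  // with null for every column that is not projected.
  public static List<String> denseOutputs(int totalCols, int[] projectedPositions) {
    List<String> exprs = new ArrayList<>(Collections.nCopies(totalCols, null));
    for (int pos : projectedPositions) {
      exprs.set(pos, "slot_" + pos);
    }
    return exprs;
  }

  // Sparse layout (the suggested approach): only projected positions are
  // stored, keyed by column position. Iteration follows insertion order.
  public static Map<Integer, String> sparseOutputs(int[] projectedPositions) {
    Map<Integer, String> exprs = new LinkedHashMap<>();
    for (int pos : projectedPositions) {
      exprs.put(pos, "slot_" + pos);
    }
    return exprs;
  }

  public static void main(String[] args) {
    int totalCols = 1000;            // assumed width of a wide table
    int[] projected = {7, 3, 512};   // assumed projected column positions

    System.out.println("dense entries:  " + denseOutputs(totalCols, projected).size());
    System.out.println("sparse entries: " + sparseOutputs(projected).size());
    System.out.println("sparse key order: " + sparseOutputs(projected).keySet());
  }
}
```

With 1000 columns and 3 projected, the dense list holds 1000 entries while the map holds 3. Note that a LinkedHashMap iterates in insertion order, not position order, so if callers need ascending positions the entries would have to be inserted that way (or a TreeMap used instead).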
--
To view, visit http://gerrit.cloudera.org:8080/21109
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98
Gerrit-Change-Number: 21109
Gerrit-PatchSet: 16
Gerrit-Owner: Steve Carlin <scar...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Steve Carlin <scar...@cloudera.com>
Gerrit-Comment-Date: Sat, 30 Mar 2024 01:56:57 +0000
Gerrit-HasComments: Yes