Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21109 )
Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple queries ...................................................................... Patch Set 21: (5 comments) http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java: http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@99 PS21, Line 99: CalciteMetadataHandler mdHandler = optional: other steps separate the constructor and doing the actual work - it would feel more consistent to me to also add a .loadTables() or similar call here http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalcitePhysPlanCreator.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalcitePhysPlanCreator.java: http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalcitePhysPlanCreator.java@50 PS21, Line 50: AuthorizationFactory authzFactory = : AuthorizationUtil.authzFactoryFrom(BackendConfig.INSTANCE); Does authorization take place in this step? Is it expected to work? I think that authorization should happen earlier, e.g. during validate, as the optimizer may optimize out tables the user has no right to access, revealing information that the user shouldn't know (e.g. that there are 0 files in the table, so it can be omitted). http://gerrit.cloudera.org:8080/#/c/21109/20/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java: http://gerrit.cloudera.org:8080/#/c/21109/20/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java@35 PS20, Line 35: ImpalaTypeSystemImpl I don't mean to solve all type system related questions in this patch, but it would be nice the add a comment about the goals of this class - what is the aim of the type system in calcite+Impala? Does it try to be as close to the old Impala planner as possible, or it wants to move closer to Hive, or wants to get more standard? Or it could be configurable? http://gerrit.cloudera.org:8080/#/c/21109/20/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java@48 PS20, Line 48: // HiveDataTypeSystemImpl in the Hive github code base. > Mentioned in above comment This is still not clear to me in the float/double case. They have fixed byte count, but the "number of digits" is a matter of formatting when converting them to strings. Checked calcite and it uses getMaxPrecision() here: https://github.com/apache/calcite/blob/152801428fc28948d8f78753c258744f7c8e253a/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L527C32-L527C52 Is the goal of setting the precision to forbid assignments from double to float? Impala works this way, while Hive seems to allow it. http://gerrit.cloudera.org:8080/#/c/21109/20/testdata/workloads/functional-query/queries/QueryTest/calcite.test File testdata/workloads/functional-query/queries/QueryTest/calcite.test: http://gerrit.cloudera.org:8080/#/c/21109/20/testdata/workloads/functional-query/queries/QueryTest/calcite.test@47 PS20, Line 47: decimal_tbl > Done This file was not updated in the last patch. -- To view, visit http://gerrit.cloudera.org:8080/21109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98 Gerrit-Change-Number: 21109 Gerrit-PatchSet: 21 Gerrit-Owner: Steve Carlin <scar...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Steve Carlin <scar...@cloudera.com> Gerrit-Comment-Date: Wed, 17 Apr 2024 15:41:48 +0000 Gerrit-HasComments: Yes