Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21109 )

Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple 
queries
......................................................................


Patch Set 21:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@99
PS21, Line 99:       CalciteMetadataHandler mdHandler =
optional: other steps separate the constructor and doing the actual work - it 
would feel more consistent to me to also add a .loadTables() or similar call 
here


http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalcitePhysPlanCreator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalcitePhysPlanCreator.java:

http://gerrit.cloudera.org:8080/#/c/21109/21/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalcitePhysPlanCreator.java@50
PS21, Line 50:     AuthorizationFactory authzFactory =
             :         
AuthorizationUtil.authzFactoryFrom(BackendConfig.INSTANCE);
Does authorization take place in this step? Is it expected to work?

I think that authorization should happen earlier, e.g. during validate, as the 
optimizer may optimize out tables the user has no right to access, revealing 
information that the user shouldn't know (e.g. that there are 0 files in the 
table, so it can be omitted).


http://gerrit.cloudera.org:8080/#/c/21109/20/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java:

http://gerrit.cloudera.org:8080/#/c/21109/20/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java@35
PS20, Line 35: ImpalaTypeSystemImpl
I don't mean to solve all type system related questions in this patch, but it 
would be nice the add a comment about the goals of this class - what is the aim 
of the type system in calcite+Impala? Does it try to be as close to the old 
Impala planner as possible, or it wants to move closer to Hive, or wants to get 
more standard? Or it could be configurable?


http://gerrit.cloudera.org:8080/#/c/21109/20/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java@48
PS20, Line 48:   // HiveDataTypeSystemImpl in the Hive github code base.
> Mentioned in above comment
This is still not clear to me in the float/double case. They have fixed byte 
count, but the "number of digits" is a matter of formatting when converting 
them to strings.

Checked calcite and it uses getMaxPrecision() here: 
https://github.com/apache/calcite/blob/152801428fc28948d8f78753c258744f7c8e253a/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L527C32-L527C52

Is the goal of setting the precision to forbid assignments  from double to 
float? Impala works this way, while Hive seems to allow it.


http://gerrit.cloudera.org:8080/#/c/21109/20/testdata/workloads/functional-query/queries/QueryTest/calcite.test
File testdata/workloads/functional-query/queries/QueryTest/calcite.test:

http://gerrit.cloudera.org:8080/#/c/21109/20/testdata/workloads/functional-query/queries/QueryTest/calcite.test@47
PS20, Line 47: decimal_tbl
> Done
This file was not updated in the last patch.



--
To view, visit http://gerrit.cloudera.org:8080/21109
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98
Gerrit-Change-Number: 21109
Gerrit-PatchSet: 21
Gerrit-Owner: Steve Carlin <scar...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Steve Carlin <scar...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Apr 2024 15:41:48 +0000
Gerrit-HasComments: Yes

Reply via email to