Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19483 )

Change subject: IMPALA-11908: Parser change for Iceberg metadata querying
......................................................................


Patch Set 6:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3341
PS6, Line 3341: addMetadataVirtualTable
nit: I feel that the name should indicate that we don't add a metadata table in 
every case. On the callsite I had the impression that a metadata table is added 
regardless we refer to a metadata table or not.
Name proposal: tryAddingMetadataTable(...)


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3342
PS6, Line 3342:     if 
(IcebergMetadataTable.isIcebergMetadataTable(tblRefPath)) {
nit: ususally my preference is exiting early if some conditions are not met 
instead of putting the whole body of a function inside an if. This reduces deep 
indentations.

if (not metadata table) return;
...


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/IcebergMetadataTableRef.java
File fe/src/main/java/org/apache/impala/analysis/IcebergMetadataTableRef.java:

http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/IcebergMetadataTableRef.java@37
PS6, Line 37: (IcebergMetadataTable)resolvedPath.getRootTable();
Precondition to verify this conversion will work well?


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/TableName.java
File fe/src/main/java/org/apache/impala/analysis/TableName.java:

http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/TableName.java@46
PS6, Line 46:   public TableName(String db, String tbl) {
This constructor could call the other one like:
this(db, tbl, null) and then we can avoid code duplication between these 
constructors.


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/TableName.java@52
PS6, Line 52:     this.vTbl_ = "";
In Path.getCandidateTables() you pass null for vTbl_ if the table is not 
virtual but in this constructor you set it to empty string. To reach 
consistency I think we should pick one of those 2 approaches. I'd vote for 
setting it null if the table is not virtual.


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/analysis/TableName.java@126
PS6, Line 126:       if (!vTbl_.isEmpty()) {
This could cause a NPE when vTbl_ is null.


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java
File 
fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java:

http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@47
PS6, Line 47: tblRefPath
If I'm not mistaken this path is only used to get the name of the metadata 
table. Would it make sense then to pass the metadata table name into this 
constructor instead of the path?


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@51
PS6, Line 51: get
I miss a few Precondition checks on the inputs here, e.g. to verify that 
'baseTable' is an instance of FeIcebergTable, or to check the number of items 
in 'tblRefPath' before we access the items.
I guess it;s expected to call isIcebergMetadataTable() on the parameter at the 
callsite but you don't have any garantee that any future implementaton will 
also call it.


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@51
PS6, Line 51: metadataTableTypeString
No need to introduce a variable for this. You can use 'tblRefPath.get(2)' in 
the below line.


http://gerrit.cloudera.org:8080/#/c/19483/6/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@52
PS6, Line 52: MetadataTableType.valueOf
I think MetadataTableType.from() is the preferred way to do the conversion from 
String that is a wrapper around MetadataTableType.valueOf(). It also takes care 
of the IllegalArgumentException that you didn't do here. You just have to check 
for null result.


http://gerrit.cloudera.org:8080/#/c/19483/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test:

http://gerrit.cloudera.org:8080/#/c/19483/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@1
PS6, Line 1: ====
Would it make sense to include all the metadata table types here not just 
'history' and 'manifests'?


http://gerrit.cloudera.org:8080/#/c/19483/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@2
PS6, Line 2: ---- QUERY
There are some metadata tables that have nested type columns, e.g. 
partitions.partition is a struct. Could you check if you can reference the 
members of these struct columns in the select list?

Also, manifests.partition_summaries seems nested type too. Would be nice to 
exercise it a little bit early to see if we have any gaps with these types.


http://gerrit.cloudera.org:8080/#/c/19483/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@26
PS6, Line 26: join
I understand this test is only for checking the join syntax, but for later 
usage I think we could have a more meaningful join here instead of joining a 
table to itself. E.g. join history with snapshots?


http://gerrit.cloudera.org:8080/#/c/19483/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@40
PS6, Line 40: ====
I just bumped into another use case wrt metadata table: you can also make 'time 
travel' with metadata tables. E.g. this is valid in Spark:
SELECT * from db.tbl.files VERSION AS OF <snapshot_id>

We might want to put this into a follow-up patch, or include here, it's up to 
you.



--
To view, visit http://gerrit.cloudera.org:8080/19483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0b5db884b5f3fecbd132fcb2c2cbd6c622ff965b
Gerrit-Change-Number: 19483
Gerrit-PatchSet: 6
Gerrit-Owner: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Mar 2023 11:11:02 +0000
Gerrit-HasComments: Yes

Reply via email to