Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15797 )

Change subject: IMPALA-9688: Support create iceberg table by impala
......................................................................


Patch Set 11:

(12 comments)

Thank you for working on this. It'll be a great addition to Impala.

Please also add some tests where you create iceberg tables, where you SHOW 
CREATE Iceberg tables, etc.

http://gerrit.cloudera.org:8080/#/c/15797/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15797/11//COMMIT_MSG@19
PS11, Line 19: stored as iceberg;
How can you specify the underlying file format?

I see that Iceberg currently supports Avro, ORC, and Parquet. Does the file 
format have to be the same for all partitions, or different partitions can use 
different file formats?


http://gerrit.cloudera.org:8080/#/c/15797/11//COMMIT_MSG@11
PS11, Line 11:     create table iceberg_test(
             :         level string,
             :         event_time string,
             :         message string)
             :     partition by spec(
             :         level identity,
             :         event_time identity
             :     )
             :     stored as iceberg;
I wonder how standard is this statement. E.g. can you create Iceberg tables the 
same way in Hive?


http://gerrit.cloudera.org:8080/#/c/15797/11//COMMIT_MSG@20
PS11, Line 20: The 'identity' mean the partition type in iceberg
Please elaborate a little. E.g.:

'identity' is one of Iceberg's Partition Transforms. 'identity' means that the 
source data values are used to create partitions. Other Partition Transforms 
hash/truncate/extract from the source data values.


http://gerrit.cloudera.org:8080/#/c/15797/11/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/15797/11/be/src/service/query-options-test.cc@225
PS11, Line 225:
nit: indentation is a bit off


http://gerrit.cloudera.org:8080/#/c/15797/11/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/15797/11/common/thrift/CatalogObjects.thrift@118
PS11, Line 118: TIcebergPartitionType
TIcebergPartitionTransform? or ...TransformType?


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@228
PS11, Line 228:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@570
PS11, Line 570: Kudu
Probably you don't want to set Kudu properties here


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java:

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java@35
PS11, Line 35:  * identity dt,
             :  * hour event_time,
             :  * day event_time,
             :  * month event_time
based on the commit message and the CUP parser the order is the opposite, e.g. 
dt identity, event_time hour, ...


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java@97
PS11, Line 97: ...
Maybe it'd be better to leave it empty?


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@61
PS11, Line 61: Partition Type
"Partition Transform" or "Partition Transform Type"


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java:

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@236
PS11, Line 236:       case ICEBERG:
Based on the spec the underlying storage files must be Avro, ORC, or Parquet. 
All of them are splittable.


http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@96
PS11, Line 96:    * Build TIcebergPartitionType by iceberg PartitionField
             :    */
             :   public static TIcebergPartitionType 
getPartitionType(PartitionField field)
             :       throws TableLoadingException {
             :     String type = field.transform().toString().toUpperCase();
             :     if ("IDENTITY".equals(type)) {
             :       return TIcebergPartitionType.IDENTITY;
             :     } else if ("HOUR".equals(type)) {
             :       return TIcebergPartitionType.HOUR;
             :     } else if ("DAY".equals(type)) {
             :       return TIcebergPartitionType.DAY;
             :     } else if ("MONTH".equals(type)) {
             :       return TIcebergPartitionType.MONTH;
             :     } else if ("YEAR".equals(type)) {
             :       return TIcebergPartitionType.YEAR;
             :     } else if ("BUCKET".equals(type)) {
             :       return TIcebergPartitionType.BUCKET;
             :     } else if ("TRUNCATE".equals(type)) {
             :       return TIcebergPartitionType.TRUNCATE;
             :     } else {
             :       throw new TableLoadingException("Unsupported iceberg 
partition type: " +
             :           field.transform());
             :     }
             :   }
You have the same function in IcebergPartitionSpec.java. Please refactor to a 
common place



--
To view, visit http://gerrit.cloudera.org:8080/15797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d85db4c904a8c758c4cfb4f19cfbdab7e6ea284
Gerrit-Change-Number: 15797
Gerrit-PatchSet: 11
Gerrit-Owner: wangsheng <sky...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: wangsheng <sky...@163.com>
Gerrit-Comment-Date: Mon, 04 May 2020 16:59:59 +0000
Gerrit-HasComments: Yes

Reply via email to