Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
......................................................................


Patch Set 4:

(37 comments)

http://gerrit.cloudera.org:8080/#/c/4414/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 53: enum THdfsFileFormat {
rename


Line 63: // rename this enum to not be Hdfs specific.
resolve


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 976:   tbl_def_without_col_defs:tbl_def
a 'create table' without col defs?


Line 980:     RESULT = new CreateTableStmt(tbl_def); 
trailing


Line 1033: // class doesn't inherit from CreateTableStmt.
should it?


Line 1065: primary_keys_val ::=
opt_primary_keys?


Line 1089: tbl_data_layout ::=
opt_...?


Line 1139:   {: 
fix spaces and tabs


Line 1370:   KW_PRIMARY key_ident
what's wrong with KW_KEY?


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/java/org/apache/impala/analysis/AnalysisUtils.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisUtils.java:

Line 30:   static void throwIfNotNullOrNotEmpty(Collection<?> c, String message)
this is actually 'not null *and* not empty'. you can also phrase that as 'not 
empty'.


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java:

Line 158:     if (fileFormat_ == THdfsFileFormat.KUDU) {
check at top


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

Line 236:         String.format("PRIMARY KEY must be used instead of the table 
property '%s'.",
not good: if you do that on an external table, you get this error message 
instead of 'primary key not allowed'. move this check into the next function.

alternatively, why not have a list of valid table properties per storage 
format, and then flag everything else as invalid?


Line 310:       distributeParam.setPKColumnDefMap(pkColumnDefsByName);
setPkColumn...


Line 315:   private boolean hasPrimaryKeysSpecified() {
hasPrimaryKeySpecified (there's only one, which can be a composite key)

hasPrimaryKey would express the same

when is it legal to call this (before/after analyze()?)


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
File fe/src/main/java/org/apache/impala/analysis/DistributeParam.java:

Line 121:           org.apache.impala.catalog.Type colType = colDef.getType();
does simply Type conflict with something?


Line 129:           if (colType.isStringType() && !exprType.isStringType()
this is basically looking for 'assignment compatible', and i'm sure we already 
have code somewhere to express that more succinctly.


Line 150:       builder.append(numBuckets_).append(" BUCKETS");
sprinkle some checkstates in here (on numbuckets and splitrows; or maybe a 
validate() function that does those checks, toThrift() would also benefit from 
it)


Line 200:             literal.setString_literal(expr.getStringValue());
checkstate that you're getting something valid


Line 211:   void setPKColumnDefMap(Map<String, ColumnDef> pkColumnDefByName) {
setPkC...


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

Line 74:   static class TableDefOptions {
'Options' is enough


Line 84:     // File format of the table
since this is clearly not a file format anymore, TStorageFormat?


Line 160:     fullyQualifiedTableName_ = analyzer.getFqTableName(getTblName());
stick with fq abbreviation?


Line 189:     for (ColumnDef colDef: getPartitionColumnDefs()) {
this is a bit hard to follow. partition cols aren't defined separately, they're 
declared. so then why do you need to call colDef.analyze()?


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

Line 111:   // Distribution schemes of this Kudu table. Both rang and 
hash-based distributions are
range


Line 140:     return msTbl.getParameters().get(KuduTable.KEY_TABLE_NAME);
i don't think this is worth a function call, it just makes the code harder to 
follow (extra level of indirection)


PS4, Line 175: numClusteringCols_ = 0;
> not really related to this change, but it's kind of confusing to have numCl
those should be the primary key cols


http://gerrit.cloudera.org:8080/#/c/4414/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

Line 1353:         "functional.alltypestiny", "Columns cannot be specified with 
an external " +
odd error message. i would expect the 'as select' to be the offending part.


Line 1720:     AnalyzesOk("create table tab (x int, y int, primary key (X)) " +
i thought kudu is case-sensitive


http://gerrit.cloudera.org:8080/#/c/4414/4/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

Line 6: as select * from functional.alltypestiny
shouldn't this be part of an analyzer test?


Line 30:   distribute by hash (x) into 3 buckets stored as kudu
same here, and for the other analysis error test cases in this file


Line 32: NonRecoverableException: Key column may not have type of BOOL, FLOAT, 
or DOUBLE
why wouldn't this be an analysis exception?


Line 46: NonRecoverableException: Got out-of-order key column: name: "y" type: 
INT32 is_key: true is_nullable: false cfile_block_size: 0
inscrutable error message


Line 53: NonRecoverableException: must have at least two hash buckets
error message should point out the offending clause


Line 60: NonRecoverableException: hash bucket schema components must not 
contain columns in common
same here


http://gerrit.cloudera.org:8080/#/c/4414/4/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test:

Line 1: ====
might be a good idea to point out at the top that this test contains test cases 
for what basically amount to analysis errors, but they only show up at runtime.


Line 5:   DISTRIBUTE BY RANGE SPLIT ROWS ((10), (30)) STORED AS KUDU
analyzer test?


http://gerrit.cloudera.org:8080/#/c/4414/4/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test:

Line 6:   DISTRIBUTE BY RANGE(name) SPLIT ROWS (('abc')) STORED AS KUDU
analyzer test?


-- 
To view, visit http://gerrit.cloudera.org:8080/4414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to