Alex Behm has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
......................................................................


Patch Set 7:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/4047/7//COMMIT_MSG
Commit Message:

Line 18:   select_stmt
More precisely this is a query_stmt because you can have a union/values/select 
stmt.


http://gerrit.cloudera.org:8080/#/c/4047/7/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 727: upsert_stmt ::=
we also need to support plan hints for upsert


Line 729:     LPAREN opt_ident_list:col_perm RPAREN opt_query_stmt:query
Why is the query stmt optional here?


http://gerrit.cloudera.org:8080/#/c/4047/7/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

Line 83:     insertStmt_ = new InsertStmt(
let's clean up this parameter hell with static createInsert/Upsert() helpers


http://gerrit.cloudera.org:8080/#/c/4047/7/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 48:  * Representation of a single insert (or upsert) statement, including 
the select statement
remove parens


Line 84:   // explicitly mentioned, will be assigned NULL values for inserted 
rows or left as is
There is some minor ambiguity with 'inserted' and 'updated'. It's not clear 
whether 'updated' only refers to those rows that already existed before an 
upsert. I think this can be simplified/clarified.

Maybe something like:

... will be assigned NULL values for INSERTs and left unassigned for UPSERTs.


Line 134:   public InsertStmt(WithClause withClause, TableName targetTable, 
boolean overwrite,
To more easily distinguish the upsert/insert cases let's make the constructor 
protected and add two static functions for creating an insert and upsert stmt 
respectively.

public static InsertStmt createInsert(...);
public static InsertStmt createUpsert(...);


Line 193:         throw new AnalysisException("UPSERT is not compatible with 
OVERWRITE");
These cases don't parse, so they should be Preconditions in the c'tor


Line 379:         if (partitionKeyValues_ != null && 
!partitionKeyValues_.isEmpty()) {
Not possible to parse this. Add Preconditions check in c'tor instead


Line 683:     if (isUpsert_) throw new AnalysisException("UPSERT does not 
support plan hints.");
also doesn't parse, but I think we should just allow plan hints (some useful 
ones are coming pretty soon)


http://gerrit.cloudera.org:8080/#/c/4047/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

Line 2454:     AnalyzesOk("upsert into table functional_kudu.testtbl values(1, 
'a', 1)");
We should either wrap these Kudu tests into:

if (RuntimeEnv.INSTANCE.isKuduSupported()) {
// tests go here
}

or we should move these tests into more Kudu-specific places where we can do 
that check wholesale

You can look at the uses of RuntimeEnv.INSTANCE.isKuduSupported()) to find a 
few interesting places.

The problem with leaving this as is is that anyone running a system that does 
not support Kudu will be unable run run tests (I believe they will just hang)

I think it might make sense to just add a new AnalyzeUpsertStmtTest


Line 2848:     AnalysisError("insert into functional.alltypes_view 
partition(year, month) " +
add same test for upsert


Line 3493:         "All key columns must be specified for Kudu tables. Missing 
columns are: id");
Somewhat misleading error msg because upsert is only supported for Kudu tables. 
Maybe rephrase to something like:

"All primary key columns must be specified for UPSERT. Missing columns are: "


http://gerrit.cloudera.org:8080/#/c/4047/7/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 573: # simple upsert with select
let's move this into a separate .test file that is run with     
Assume.assumeTrue(RuntimeEnv.INSTANCE.isKuduSupported()) like the other Kudu 
planner tests


Line 594: upsert into table functional_kudu.testtbl
also add a test where the primary-key columns are fed from the result of an 
inline view (let me know if this needs clarification)


http://gerrit.cloudera.org:8080/#/c/4047/7/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test:

Line 306: upsert into table tdata (valb) values (true), (false)
remove, covered by analysis tests


-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to