zabetak commented on code in PR #5781: URL: https://github.com/apache/hive/pull/5781#discussion_r2073771610
########## ql/src/java/org/apache/hadoop/hive/ql/Context.java: ########## @@ -361,8 +361,16 @@ private DestClausePrefix getMergeDestClausePrefix(ASTNode curNode) { assert insert != null && insert.getType() == HiveParser.TOK_INSERT; ASTNode query = (ASTNode) insert.getParent(); assert query != null && query.getType() == HiveParser.TOK_QUERY; - - int tokFromIdx = query.getFirstChildWithType(HiveParser.TOK_FROM).getChildIndex(); + ASTNode from = (ASTNode) query.getFirstChildWithType(HiveParser.TOK_FROM); + + if (from == null) { + // We are here when TOK_FROM is missing from the AST. + // This can happen for merge queries with a predicate like `<joining_column> is null` + // in the matched clause. + return DestClausePrefix.MERGE; + } Review Comment: I understand that the FROM clause may be missing but why do we have to return `DestClausePrefix.MERGE` and not one of the other constants? ########## iceberg/iceberg-handler/src/test/queries/positive/merge_with_null_check_on_joining_col.q: ########## @@ -0,0 +1,24 @@ + +create table target(a int, b int, c int) stored by iceberg tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write'); +create table source(a int, b int, c int) stored by iceberg tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write'); + +-- empty plan as joining column cannot be null for matched clause +explain cbo +merge into target as t using source as s on t.a = s.a and t.b = s.b +when matched and t.a is null then delete; + +explain cbo +merge into target as t using source as s on t.a = s.a and t.b = s.b +when matched and t.a is null then update set b = t.b + 10; Review Comment: Does it make sense to add tests where the merge statement has multiple branches (delete, update, insert) and only one of them is false each time? ########## iceberg/iceberg-handler/src/test/queries/positive/merge_with_null_check_on_joining_col.q: ########## @@ -0,0 +1,24 @@ + +create table target(a int, b int, c int) stored by iceberg tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write'); +create table source(a int, b int, c int) stored by iceberg tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write'); + +-- empty plan as joining column cannot be null for matched clause +explain cbo +merge into target as t using source as s on t.a = s.a and t.b = s.b +when matched and t.a is null then delete; + +explain cbo +merge into target as t using source as s on t.a = s.a and t.b = s.b +when matched and t.a is null then update set b = t.b + 10; +-------------------------------------------------------------------- + + -- non empty plans for these queries +explain cbo +merge into target as t using source as s on t.a = s.a and t.b = s.b +when not matched and t.a is null then insert values (1, 2, 3); + +explain cbo +merge into target as t using source as s on t.a = s.a and t.b = s.b +when matched and t.a is null then delete +when matched then update set b = t.b + 10 +when not matched then insert values (1, 2, 3); Review Comment: I assume we already have tests with MERGE statements and non-empty branches so if the above queries do not add new coverage then I would suggest to remove them. ########## iceberg/iceberg-handler/src/test/queries/positive/merge_with_null_check_on_joining_col.q: ########## @@ -0,0 +1,24 @@ + +create table target(a int, b int, c int) stored by iceberg tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write'); +create table source(a int, b int, c int) stored by iceberg tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write'); Review Comment: It seems that the problem is not specific to iceberg but affects all kinds of merge statements. If that's the case it may be good to write the test case using regular ACID tables and the more common `TestMiniLlapLocalCliDriver`. ########## ql/src/java/org/apache/hadoop/hive/ql/Context.java: ########## @@ -361,8 +361,16 @@ private DestClausePrefix getMergeDestClausePrefix(ASTNode curNode) { assert insert != null && insert.getType() == HiveParser.TOK_INSERT; ASTNode query = (ASTNode) insert.getParent(); assert query != null && query.getType() == HiveParser.TOK_QUERY; - - int tokFromIdx = query.getFirstChildWithType(HiveParser.TOK_FROM).getChildIndex(); + ASTNode from = (ASTNode) query.getFirstChildWithType(HiveParser.TOK_FROM); Review Comment: According to the comment a few line above the code here assumes that the AST has a certain shape. I am not sure to what extend other parts of the code rely on this structure. Have you checked if there is a more intuitive fix when going from CBO to AST? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org