[ https://issues.apache.org/jira/browse/HIVE-21160?focusedWorklogId=772732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772732 ]
ASF GitHub Bot logged work on HIVE-21160:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/May/22 09:03
            Start Date: 20/May/22 09:03
    Worklog Time Spent: 10m
      Work Description:
kasakrisz commented on code in PR #2855:
URL: https://github.com/apache/hive/pull/2855#discussion_r877912792


##########
ql/src/test/results/clientpositive/llap/acid_direct_update_delete_partitions.q.out:
##########
@@ -148,30 +156,50 @@ POSTHOOK: Input: default@test_update_part@c=11
 POSTHOOK: Input: default@test_update_part@c=22
 POSTHOOK: Input: default@test_update_part@c=33
 POSTHOOK: Input: default@test_update_part@c=__HIVE_DEFAULT_PARTITION__
+POSTHOOK: Output: default@test_update_part
+POSTHOOK: Output: default@test_update_part@c=11
 POSTHOOK: Output: default@test_update_part@c=11
 POSTHOOK: Output: default@test_update_part@c=22
+POSTHOOK: Output: default@test_update_part@c=22
 POSTHOOK: Output: default@test_update_part@c=33
 POSTHOOK: Output: default@test_update_part@c=__HIVE_DEFAULT_PARTITION__
+POSTHOOK: Output: default@test_update_part@c=__HIVE_DEFAULT_PARTITION__
+POSTHOOK: Lineage: test_update_part PARTITION(c=11).a SIMPLE []

Review Comment:
   With early split update we end up with a multi-insert statement in which all of the branches insert into the same table. Hive collects the lineage info into a List, not a Set, while traversing all insert branches, which leads to duplicates.


##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java:
##########
@@ -534,7 +534,7 @@ public void testUpdateSomeColumnsUsed() throws Exception {
     assertEquals(1, inputs.size());
     tableObj = inputs.get(0);
     assertEquals(2, tableObj.getColumns().size());
-    assertEquals("j", tableObj.getColumns().get(0));
+    assertEquals("j", tableObj.getColumns().get(0 ));

Review Comment:
   Fixed.

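
The first review comment above attributes the repeated entries to collecting into a List rather than a Set while walking every insert branch. A minimal sketch of that effect follows. It is not Hive's actual code: the class name, both method names, and the use of plain strings as stand-ins for lineage entries are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hive.
public class LineageCollectionSketch {

    // Appends every branch's entries to a List, so an entry that appears
    // in more than one insert branch is recorded once per branch.
    static List<String> collectAsList(List<List<String>> branches) {
        List<String> collected = new ArrayList<>();
        for (List<String> branch : branches) {
            collected.addAll(branch);
        }
        return collected;
    }

    // Same traversal, but a LinkedHashSet keeps only the first occurrence
    // of each entry while preserving encounter order.
    static Set<String> collectAsSet(List<List<String>> branches) {
        Set<String> collected = new LinkedHashSet<>();
        for (List<String> branch : branches) {
            collected.addAll(branch);
        }
        return collected;
    }

    public static void main(String[] args) {
        // Two insert branches that both write to the same partitions.
        List<String> partitions = Arrays.asList(
                "default@test_update_part@c=11",
                "default@test_update_part@c=22");
        List<List<String>> branches = Arrays.asList(partitions, partitions);

        System.out.println(collectAsList(branches).size()); // 4 entries, with duplicates
        System.out.println(collectAsSet(branches).size());  // 2 entries, deduplicated
    }
}
```

With two branches that each contribute the same two entries, the List ends up with four entries and the Set with two, which mirrors the duplicated lines in the q.out diff.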
Issue Time Tracking
-------------------

    Worklog Id:     (was: 772732)
    Time Spent: 1h 20m  (was: 1h 10m)

> Rewrite Update statement as Multi-insert and do Update split early
> ------------------------------------------------------------------
>
>                 Key: HIVE-21160
>                 URL: https://issues.apache.org/jira/browse/HIVE-21160
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)