[ https://issues.apache.org/jira/browse/HIVE-21172?focusedWorklogId=695795&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695795 ]
ASF GitHub Bot logged work on HIVE-21172:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Dec/21 13:59
            Start Date: 14/Dec/21 13:59
    Worklog Time Spent: 10m
      Work Description:
kasakrisz commented on a change in pull request #2857:
URL: https://github.com/apache/hive/pull/2857#discussion_r768692003

##########
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
##########
@@ -1711,13 +1713,13 @@ public void testMajorCompactionAfterTwoMergeStatements() throws Exception {
     // Verify contents of bucket files.
     List<String> expectedRsBucket0 = Arrays.asList("{\"writeid\":1,\"bucketid\":536870912,\"rowid\":3}\t4\tvalue_4",
-        "{\"writeid\":2,\"bucketid\":536870912,\"rowid\":0}\t6\tvalue_6",
-        "{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",
-        "{\"writeid\":3,\"bucketid\":536870912,\"rowid\":0}\t8\tvalue_8",
-        "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":0}\t5\tnewestvalue_5",
-        "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":1}\t7\tnewestvalue_7",
-        "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":2}\t1\tnewestvalue_1",
-        "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":3}\t2\tnewestvalue_2");
+        "{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",

Review comment:
iiuc this test was created to verify the order of records in the bucket files after major compaction.
Based on the description of https://issues.apache.org/jira/browse/HIVE-25257, the records should be ordered by originalTransactionId, bucketProperty, rowId.

Unfortunately originalTransactionId cannot be queried, so I debugged the test and stopped the execution before this assert.
Then I dumped the ORC file created on my local fs:
```
java -jar orc-tools-1.6.5/orc-tools-1.6.5-uber.jar data ./itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-1639489721524_1338883398/warehouse/comp_and_merge_test/base_0000003_v0000014/bucket_00000
Processing data file itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-1639489721524_1338883398/warehouse/comp_and_merge_test/base_0000003_v0000014/bucket_00000 [length: 808]
{"operation":0,"originaltransaction":1,"bucket":536870912,"rowid":3,"currenttransaction":1,"row":{"id":4,"value":"value_4"}}
{"operation":0,"originaltransaction":2,"bucket":536870913,"rowid":2,"currenttransaction":2,"row":{"id":3,"value":"newvalue_3"}}
{"operation":0,"originaltransaction":2,"bucket":536870914,"rowid":0,"currenttransaction":2,"row":{"id":6,"value":"value_6"}}
{"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":0,"currenttransaction":3,"row":{"id":1,"value":"newestvalue_1"}}
{"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":1,"currenttransaction":3,"row":{"id":2,"value":"newestvalue_2"}}
{"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":2,"currenttransaction":3,"row":{"id":5,"value":"newestvalue_5"}}
{"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":3,"currenttransaction":3,"row":{"id":7,"value":"newestvalue_7"}}
{"operation":0,"originaltransaction":3,"bucket":536870914,"rowid":0,"currenttransaction":3,"row":{"id":8,"value":"value_8"}}
________________________________________________________________________________________________________________________
```
Order seems to be valid.
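The manual check in the review comment (records ordered by originalTransactionId, then bucketProperty, then rowId) can also be expressed as a small standalone program. This is only a sketch: `AcidOrderCheck`, `AcidKey`, `isSorted` and `dumpedKeys` are hypothetical names that are not part of Hive or the test, the key values are copied by hand from the `orc-tools` dump quoted in the comment, and comparing the bucket property as a plain signed `int` is an assumption that happens to hold for the positive values in this dump.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class AcidOrderCheck {

    /** Hypothetical holder for the three key fields visible in the dump. */
    static final class AcidKey {
        final long originalTransaction;
        final int bucket;
        final long rowId;

        AcidKey(long originalTransaction, int bucket, long rowId) {
            this.originalTransaction = originalTransaction;
            this.bucket = bucket;
            this.rowId = rowId;
        }
    }

    /** Ordering named in the review comment: originalTransactionId, bucketProperty, rowId. */
    static final Comparator<AcidKey> ORDER =
        Comparator.<AcidKey>comparingLong(k -> k.originalTransaction)
            .thenComparingInt(k -> k.bucket)
            .thenComparingLong(k -> k.rowId);

    /** Returns true if every key is greater than or equal to its predecessor. */
    static boolean isSorted(List<AcidKey> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (ORDER.compare(keys.get(i - 1), keys.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    /** Keys copied by hand from the dump, in file order. */
    static List<AcidKey> dumpedKeys() {
        return Arrays.asList(
            new AcidKey(1, 536870912, 3),
            new AcidKey(2, 536870913, 2),
            new AcidKey(2, 536870914, 0),
            new AcidKey(3, 536870913, 0),
            new AcidKey(3, 536870913, 1),
            new AcidKey(3, 536870913, 2),
            new AcidKey(3, 536870913, 3),
            new AcidKey(3, 536870914, 0));
    }

    public static void main(String[] args) {
        System.out.println("sorted = " + isSorted(dumpedKeys()));
    }
}
```

Running `main` prints `sorted = true` for the eight records in the dump, which agrees with the manual conclusion above.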
Issue Time Tracking
-------------------

    Worklog Id:     (was: 695795)
    Time Spent: 40m  (was: 0.5h)

> DEFAULT keyword handling in MERGE UPDATE clause issues
> ------------------------------------------------------
>
>                 Key: HIVE-21172
>                 URL: https://issues.apache.org/jira/browse/HIVE-21172
>             Project: Hive
>          Issue Type: Sub-task
>          Components: SQL, Transactions
>    Affects Versions: 4.0.0
>            Reporter: Eugene Koifman
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Once HIVE-21159 lands, enable {{HiveConf.MERGE_SPLIT_UPDATE}} and run these
> tests:
> TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
> mvn test -Dtest=TestMiniLlapLocalCliDriver
> -Dqfile=insert_into_default_keyword.q
> Merge is rewritten as a multi-insert. When the Update clause has DEFAULT, it's
> not properly replaced with a value in the multi-insert - it's treated as a
> literal:
> {noformat}
> INSERT INTO `default`.`acidTable` -- update clause(insert part)
> SELECT `t`.`key`, `DEFAULT`, `t`.`value`
> WHERE `t`.`key` = `s`.`key` AND `s`.`key` > 3 AND NOT(`s`.`key` < 3)
> {noformat}
> See {{LOG.info("Going to reparse <" + originalQuery + "> as \n<" +
> rewrittenQueryStr.toString() + ">");}} in hive.log
> {{MergeSemanticAnalyzer.replaceDefaultKeywordForMerge()}} is only called in
> {{handleInsert}} but not {{handleUpdate()}}. Why does the issue only show up with
> {{MERGE_SPLIT_UPDATE}}?
> Once this is fixed, HiveConf.MERGE_SPLIT_UPDATE should be true by default.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)