kokila-19 commented on code in PR #5901: URL: https://github.com/apache/hive/pull/5901#discussion_r2175191636
##########
ql/src/test/queries/clientpositive/acid_direct_delete.q:
##########
@@ -0,0 +1,18 @@
+set hive.mapred.mode=nonstrict;
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.acid.direct.insert.enabled=true;
+
+drop table if exists full_acid;
+
+set mapreduce.job.reduces=7;
+
+create external table ext(a int) stored as textfile;
+insert into table ext values(1),(2),(3),(4),(5),(6),(7), (8), (9), (12);
+create table full_acid(a int) stored as orc tblproperties("transactional"="true");
+
+insert into table full_acid select * from ext where a != 3 and a <=7 group by a;

Review Comment:
Without `group by`, every row gets the same bucket id, as shown here. Deleting rows that share a bucket id does not trigger this issue.

```
{"writeid":1,"bucketid":536870912,"rowid":0}	1
{"writeid":1,"bucketid":536870912,"rowid":1}	2
{"writeid":1,"bucketid":536870912,"rowid":2}	4
{"writeid":1,"bucketid":536870912,"rowid":3}	5
{"writeid":1,"bucketid":536870912,"rowid":4}	6
{"writeid":1,"bucketid":536870912,"rowid":5}	7
{"writeid":2,"bucketid":536870912,"rowid":0}	8
{"writeid":2,"bucketid":536870912,"rowid":1}	9
{"writeid":2,"bucketid":536870912,"rowid":2}	12
```

To reproduce this issue, we need to delete two rows with different bucket ids, where the bucket number of the first row is greater than that of the second.

With `group by`, we get

```
{"writeid":1,"bucketid":536936448,"rowid":0}	1
{"writeid":1,"bucketid":537001984,"rowid":0}	7
{"writeid":1,"bucketid":537067520,"rowid":0}	4
{"writeid":1,"bucketid":537067520,"rowid":1}	6
{"writeid":1,"bucketid":537198592,"rowid":0}	5
{"writeid":1,"bucketid":537264128,"rowid":0}	2
{"writeid":2,"bucketid":536936448,"rowid":0}	9
{"writeid":2,"bucketid":537133056,"rowid":0}	12
{"writeid":2,"bucketid":537264128,"rowid":0}	8
```

The qtest deletes 2 (bucket id 537264128) and 12 (bucket id 537133056), whose bucket numbers are 6 and 4 respectively, which triggers this particular scenario
where we get an ArrayIndexOutOfBoundsException for index 6, because the outPathCommitted array is shortened based on the last bucket number, which is 4.
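
For anyone checking the bucket numbers quoted above, here is a minimal sketch in Python. It assumes the `bucketid` values use Hive's `BucketCodec` V1 layout, with the bucket number held in the 12 bits starting at bit 16. The helper name `bucket_number` and the array-sizing step at the end are hypothetical: the latter is only a toy model of the behaviour described in this comment, not the actual Hive code.

```python
def bucket_number(bucket_property: int) -> int:
    """Extract the bucket number from an encoded bucketid.

    Assumes the BucketCodec V1 layout: 3 version bits, 1 reserved bit,
    12 bucket bits, 4 reserved bits, 12 statement-id bits.
    """
    return (bucket_property >> 16) & 0xFFF


# Rows deleted by the qtest, in deletion order: value -> bucketid.
deleted = {2: 537264128, 12: 537133056}
buckets = [bucket_number(prop) for prop in deleted.values()]

for value, bucket in zip(deleted, buckets):
    print(value, bucket)  # prints "2 6" then "12 4"

# Toy model (assumption): an array sized from the LAST bucket number seen
# cannot be indexed by an earlier, larger bucket number.
last_bucket = buckets[-1]
slots = [False] * (last_bucket + 1)
first_bucket_fits = buckets[0] < len(slots)
print(first_bucket_fits)  # prints "False"
```

Without `group by`, every row decodes to bucket 0, so the first deleted row's bucket number can never exceed the last one and the out-of-range index never arises.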