kokila-19 commented on code in PR #5901:
URL: https://github.com/apache/hive/pull/5901#discussion_r2175191636


##########
ql/src/test/queries/clientpositive/acid_direct_delete.q:
##########
@@ -0,0 +1,18 @@
+set hive.mapred.mode=nonstrict;
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.acid.direct.insert.enabled=true;
+
+drop table if exists full_acid;
+
+set mapreduce.job.reduces=7;
+
+create external table ext(a int) stored as textfile;
+insert into table ext values(1),(2),(3),(4),(5),(6),(7), (8), (9), (12);
+create table full_acid(a int) stored as orc tblproperties("transactional"="true");
+
+insert into table full_acid select * from ext where a != 3 and a <=7 group by a;

Review Comment:
   Without `group by`, every row ends up with the same bucket id, as shown below.
   Deleting rows that share a bucket id does not trigger this issue.
   
   ```
   {"writeid":1,"bucketid":536870912,"rowid":0} 1
   {"writeid":1,"bucketid":536870912,"rowid":1} 2
   {"writeid":1,"bucketid":536870912,"rowid":2} 4
   {"writeid":1,"bucketid":536870912,"rowid":3} 5
   {"writeid":1,"bucketid":536870912,"rowid":4} 6
   {"writeid":1,"bucketid":536870912,"rowid":5} 7
   {"writeid":2,"bucketid":536870912,"rowid":0} 8
   {"writeid":2,"bucketid":536870912,"rowid":1} 9
   {"writeid":2,"bucketid":536870912,"rowid":2} 12
   ```
   To reproduce this issue, we need to delete two rows with different bucket ids,
   where the bucket number of the first row is greater than that of the second.
   With `group by`, we get:
   ```
   {"writeid":1,"bucketid":536936448,"rowid":0} 1
   {"writeid":1,"bucketid":537001984,"rowid":0} 7
   {"writeid":1,"bucketid":537067520,"rowid":0} 4
   {"writeid":1,"bucketid":537067520,"rowid":1} 6
   {"writeid":1,"bucketid":537198592,"rowid":0} 5
   {"writeid":1,"bucketid":537264128,"rowid":0} 2
   {"writeid":2,"bucketid":536936448,"rowid":0} 9
   {"writeid":2,"bucketid":537133056,"rowid":0} 12
   {"writeid":2,"bucketid":537264128,"rowid":0} 8
   ```
   
   The qtest deletes the rows with values 2 (bucketid: 537264128) and 12
   (bucketid: 537133056), whose bucket numbers are 6 and 4 respectively. This
   triggers the scenario where we get an ArrayIndexOutOfBoundsException for
   index 6, because the outPathCommitted array is shortened based on the last
   bucket number, which is 4.
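
   For reference, the bucket numbers quoted above can be read off the `bucketid`
   values. This is a minimal sketch, assuming the V1 bucket-property layout
   (3 version bits, 1 reserved bit, then a 12-bit bucket number starting at
   bit 16). `BucketIdDecoder` and `bucketNumber` are hypothetical names used
   only for illustration, not Hive APIs.

   ```java
   public class BucketIdDecoder {

       // Assumption: the bucket number sits in bits 16..27 of the encoded value.
       static int bucketNumber(int bucketProperty) {
           return (bucketProperty >>> 16) & 0xFFF;
       }

       public static void main(String[] args) {
           // bucketid values of the two rows deleted in the qtest (values 2 and 12)
           int[] deleted = {537264128, 537133056};
           for (int id : deleted) {
               System.out.println(id + " -> bucket " + bucketNumber(id));
           }
           // prints:
           // 537264128 -> bucket 6
           // 537133056 -> bucket 4
       }
   }
   ```

   Under the same assumption, 536870912 (the only bucketid in the run without
   `group by`) decodes to bucket 0.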



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

