Mahesh Raju Somalaraju created HIVE-27198:
---------------------------------------------
Summary: Delete directly aborted transactions instead of select
and loading ids
Key: HIVE-27198
URL: https://issues.apache.org/jira/browse/HIVE-27198
Project: Hive
Issue Type: Improvement
Reporter: Mahesh Raju Somalaraju
Assignee: Mahesh Raju Somalaraju
in cleaning the aborted transaction , we can directly deletes the txns instead
of selecting and process.
method name:
cleanEmptyAbortedAndCommittedTxns
Code:
String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " +
"\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
" (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " +
TxnStatus.COMMITTED + ") AND "
+ " \"TXN_ID\" < " + lowWaterMark;
proposed code:
String s = "DELETE \"TXN_ID\" FROM \"TXNS\" WHERE " +
"\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
" (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " +
TxnStatus.COMMITTED + ") AND "
+ " \"TXN_ID\" < " + lowWaterMark;
the select needs to be eliminated and the delete should work with the where
clause instead of the built in clause
we can see no reason for loading the ids into memory and then generate a huge
sql
Bathcing is also not necessary here, we can deletes the records directly
--
This message was sent by Atlassian Jira
(v8.20.10#820010)