[ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795673
 ]

ASF GitHub Bot logged work on HIVE-26414:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Jul/22 14:21
            Start Date: 27/Jul/22 14:21
    Worklog Time Spent: 10m 
      Work Description: SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r931126498


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   > btw, would it be hard to create a completionHook similar to Iceberg one?
   
   We could create one, but it would only cover failures that occur during 
query execution.
   Anything that happens after query execution (post-execution activities such 
as releasing locks) would be outside its scope, which is why I ruled out the 
hook approach.
   
   The hooks are invoked from the finally block here:
   
https://github.com/apache/hive/blob/b197ed86029f07696e326acb5878f86c286e9e1a/ql/src/java/org/apache/hadoop/hive/ql/Executor.java#L118
   
   Cleanup would then also depend on a HiveConf setting, `hive.query.lifetime.hooks`. 
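
   The scoping concern can be illustrated with a minimal, self-contained sketch. It does not use Hive's hook API: `LifetimeHook`, `execute`, `postExecution`, and `runQuery` are hypothetical names that only mimic the shape described above (hooks fired from a finally block around execution, followed by a post-execution step such as releasing locks, with the hook present only when configured).

```java
import java.util.ArrayList;
import java.util.List;

public class HookScopeSketch {

    /** Hypothetical stand-in for a query lifetime hook; not Hive's real interface. */
    interface LifetimeHook {
        void afterExecution(boolean hasError);
    }

    /** Execution phase: hooks fire from the finally block, so they only see failures raised here. */
    static void execute(List<LifetimeHook> hooks, boolean failDuringExecution) {
        boolean hasError = true;
        try {
            if (failDuringExecution) {
                throw new IllegalStateException("execution failed");
            }
            hasError = false;
        } finally {
            for (LifetimeHook hook : hooks) {
                hook.afterExecution(hasError);
            }
        }
    }

    /** Post-execution phase (for example, releasing locks); no hook runs after this point. */
    static void postExecution(boolean failAfterExecution) {
        if (failAfterExecution) {
            throw new IllegalStateException("post-execution failed");
        }
    }

    /** Runs both phases and returns the events that were observed, in order. */
    public static List<String> runQuery(boolean hookConfigured, boolean failDuringExecution,
                                        boolean failAfterExecution) {
        List<String> events = new ArrayList<>();
        List<LifetimeHook> hooks = new ArrayList<>();
        if (hookConfigured) {
            // Assumption: the cleanup hook is only registered when configuration enables it.
            hooks.add(hasError -> {
                if (hasError) {
                    events.add("hook cleanup");
                }
            });
        }
        try {
            execute(hooks, failDuringExecution);
            postExecution(failAfterExecution);
        } catch (IllegalStateException e) {
            events.add(e.getMessage());
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runQuery(true, true, false));  // failure during execution
        System.out.println(runQuery(true, false, true));  // failure after execution
        System.out.println(runQuery(false, true, false)); // hook not configured
    }
}
```

   In this sketch the cleanup event is recorded only in the first case. A failure after execution, or a run without the hook configured, leaves the written data untouched, which is the gap the hook approach would have.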





Issue Time Tracking
-------------------

    Worklog Id:     (was: 795673)
    Time Spent: 6h 20m  (was: 6h 10m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26414
>                 URL: https://issues.apache.org/jira/browse/HIVE-26414
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sourabh Badhya
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is currently not cleaned up by 
> the cleaner or any other mechanism. This is because the cleaner requires a 
> table corresponding to what it is cleaning. To work around this situation, we 
> can pass the relevant information directly to the cleaner so that such 
> uncommitted data is deleted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
