[ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=794187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794187
 ]

ASF GitHub Bot logged work on HIVE-26414:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jul/22 11:41
            Start Date: 22/Jul/22 11:41
    Worklog Time Spent: 10m 
      Work Description: SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927568434


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   Yes, the cleanup is for cases where there is no concurrent CTAS and a single 
query fails. But if we disable exclusive locking and run concurrent CTAS 
operations on the same location, and let's say the first query fails, then the 
first query triggers cleanup on that location while the second query writes to 
the same location. 
   
   This is the situation I want to avoid, which is why I perform cleanup only 
when exclusive locking on CTAS is enabled.
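
   The race described above can be illustrated with a toy model. This is only a 
sketch and not Hive code: the class `CtasCleanupSketch`, the method `run`, and 
the file names are hypothetical, and it assumes that cleanup removes everything 
under the shared location rather than only the failed query's files.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model (not Hive code) of two CTAS queries, A and B, targeting one location. */
final class CtasCleanupSketch {

    /**
     * Simulates query A failing and returns what is left in the location
     * (file name -> writing query) once query B has finished.
     *
     * @param exclusiveLock whether CTAS takes an exclusive lock, so A and B are serialized
     * @param guardCleanup  whether cleanup is performed only when exclusiveLock is true
     */
    static Map<String, String> run(boolean exclusiveLock, boolean guardCleanup) {
        Map<String, String> location = new TreeMap<>();
        // Assumption: cleanup is location-wide, so it cannot tell A's files from B's.
        boolean cleanupRuns = !guardCleanup || exclusiveLock;

        if (exclusiveLock) {
            // Serialized: A writes and fails before B is allowed to start.
            location.put("a_0", "A");
            if (cleanupRuns) {
                location.clear(); // only A's uncommitted data is present, safe to delete
            }
            location.put("b_0", "B");
        } else {
            // Concurrent: both queries have written before A fails.
            location.put("a_0", "A");
            location.put("b_0", "B");
            if (cleanupRuns) {
                location.clear(); // also deletes B's in-flight data
            }
        }
        return location;
    }

    public static void main(String[] args) {
        System.out.println("lock on, guarded cleanup:    " + run(true, true));
        System.out.println("lock off, unguarded cleanup: " + run(false, false));
        System.out.println("lock off, guarded cleanup:   " + run(false, true));
    }
}
```

   In this model the guard trades leftover data for safety: with locking 
disabled, the failed query's files stay behind, but the second query's files 
are not deleted.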





Issue Time Tracking
-------------------

    Worklog Id:     (was: 794187)
    Time Spent: 4h  (was: 3h 50m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26414
>                 URL: https://issues.apache.org/jira/browse/HIVE-26414
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sourabh Badhya
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is currently not cleaned up 
> by the cleaner or any other mechanism. This is because the cleaner requires 
> a table corresponding to what it is cleaning. To handle this situation, we 
> can pass the relevant information directly to the cleaner so that such 
> uncommitted data is deleted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
