[ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773780&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773780
 ]

ASF GitHub Bot logged work on HIVE-26244:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/May/22 22:17
            Start Date: 23/May/22 22:17
    Worklog Time Spent: 10m 
      Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879906851


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##########
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
             return response;
           }
         }
+
+        if (isValidTxn(txnId)) {
+          LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+                  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+          if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   
   We cannot know at what stage the 1st query might abort. Since that is 
non-deterministic, we had to make an assumption.
   
   When this is enabled via the conf, we are optimistic about the outcome and 
assume the 1st query always succeeds. With this assumption, we can fail the 2nd 
concurrent CTAS query early and avoid the unnecessary move tasks and cleanup 
that it would incur if it were to continue until the commit stage. The 2nd 
user also does not have to wait a long time to find out that the query failed.
   
   When this feature is disabled, the query runs with the pessimistic 
assumption that the 1st query can abort. As a result, the 2nd query is not 
failed until the commit stage. This results in a lot of overhead and cleanup 
for the failed query, and it may make the user wait a long time only to find 
out that the query failed, which I think is not ideal.
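
To make the two modes concrete, here is a minimal sketch of the decision described above. It is not the code in the PR: the enums are simplified stand-ins for `LockType` and `LockState`, and `failEarlyEnabled`, `Decision`, and `decide` are hypothetical names used only for illustration (the flag stands in for the conf).

```java
public class CtasLockSketch {
    // Simplified stand-ins for Hive's LockType and LockState (assumption: only the values needed here).
    enum LockType { SHARED_READ, EXCL_WRITE }
    enum LockState { ACQUIRED, WAITING }

    // Hypothetical outcome type, not part of Hive.
    enum Decision { FAIL_EARLY, WAIT_UNTIL_COMMIT }

    /**
     * Decides what happens to the 2nd query's lock request.
     * failEarlyEnabled is a hypothetical flag standing in for the conf.
     */
    static Decision decide(boolean failEarlyEnabled, LockType requested, LockState blockerState) {
        // The 2nd query wants an exclusive write lock that the 1st query already holds.
        boolean conflict = requested == LockType.EXCL_WRITE && blockerState == LockState.ACQUIRED;
        if (failEarlyEnabled && conflict) {
            // Optimistic: assume the 1st query succeeds, so the 2nd cannot win.
            return Decision.FAIL_EARLY;
        }
        // Pessimistic (or no conflict): the 1st query may still abort, so keep going.
        return Decision.WAIT_UNTIL_COMMIT;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, LockType.EXCL_WRITE, LockState.ACQUIRED));
        System.out.println(decide(false, LockType.EXCL_WRITE, LockState.ACQUIRED));
    }
}
```

With the flag on, the conflicting request is rejected right away; with it off, the same request proceeds and any failure surfaces only at commit.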
   
   This was my thought process. Would this be fine?
   





Issue Time Tracking
-------------------

    Worklog Id:     (was: 773780)
    Time Spent: 1.5h  (was: 1h 20m)

> Implementing locking for concurrent ctas
> ----------------------------------------
>
>                 Key: HIVE-26244
>                 URL: https://issues.apache.org/jira/browse/HIVE-26244
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Simhadri G
>            Assignee: Simhadri G
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)
