Harshal Patel created HIVE-29459:
------------------------------------

             Summary: [DR][ACIDReplication] Add clearDanglingTxnTaskTask at the end
                 Key: HIVE-29459
                 URL: https://issues.apache.org/jira/browse/HIVE-29459
             Project: Hive
          Issue Type: Bug
          Components: repl
    Affects Versions: 4.2.0
            Reporter: Harshal Patel
            Assignee: Harshal Patel


Currently, clearDanglingTxnTaskTask is added at the end of ReplLoadTask. That 
works in the normal scenario:

 
{code:java}
if (conf.getBoolVar(HiveConf.ConfVars.HIVE_REPL_CLEAR_DANGLING_TXNS_ON_TARGET)) {
  ClearDanglingTxnWork clearDanglingTxnWork =
      new ClearDanglingTxnWork(work.getDumpDirectory(), targetDb.getName());
  Task<ClearDanglingTxnWork> clearDanglingTxnTaskTask =
      TaskFactory.get(clearDanglingTxnWork, conf);
  if (childTasks.isEmpty()) {
    childTasks.add(clearDanglingTxnTaskTask);
  } else {
    DAGTraversal.traverse(childTasks,
        new AddDependencyToLeaves(Collections.singletonList(clearDanglingTxnTaskTask)));
  }
}
return 0;
{code}
 

[https://github.com/apache/hive/blob/38a963540000729f0ac8e8d2ac9cd1ca22930d2a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java#L966]

But if the number of events in an incremental load is greater than 
{{hive.repl.approx.max.load.tasks}}, the load operation can break the tasks 
down into batches of approximately {{hive.repl.approx.max.load.tasks}} (not a 
hard limit).

In this case, clearDanglingTxnTaskTask gets called between the batches rather 
than only once at the end of the load cycle. This can lead to premature 
cleaning of repl_txn_map and to transactions being aborted in the middle of 
replication.
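To make the failure mode concrete, here is a minimal toy model of a batched load cycle. It is only a sketch: {{BatchedLoadSketch}}, {{runLoadCycle}} and {{countCleanups}} are hypothetical names, not Hive APIs, and the batch limit is treated as a hard limit for simplicity even though the real one is approximate.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a batched incremental load; none of these names are Hive APIs. */
final class BatchedLoadSketch {

  /** Splits eventCount events into batches of at most maxTasks and records what each batch schedules. */
  static List<String> runLoadCycle(int eventCount, int maxTasks) {
    if (maxTasks <= 0) {
      throw new IllegalArgumentException("maxTasks must be positive");
    }
    List<String> schedule = new ArrayList<>();
    int remaining = eventCount;
    int batch = 0;
    do {
      int size = Math.min(remaining, maxTasks);
      remaining -= size;
      batch++;
      schedule.add("batch " + batch + ": apply " + size + " events");
      // Behaviour described in this issue: the cleanup task is appended to every batch.
      schedule.add("batch " + batch + ": clear dangling txns");
    } while (remaining > 0);
    return schedule;
  }

  /** Counts how many cleanup steps were scheduled in the given cycle. */
  static long countCleanups(List<String> schedule) {
    return schedule.stream().filter(s -> s.endsWith("clear dangling txns")).count();
  }

  public static void main(String[] args) {
    List<String> schedule = runLoadCycle(25, 10);
    schedule.forEach(System.out::println);
    System.out.println("cleanups in one load cycle: " + countCleanups(schedule));
  }
}
```

With 25 events and a limit of 10, the model produces three batches and three cleanup steps, two of which run while events from the same load cycle are still pending.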

Fix:

Add an additional check so that the cleanup task is added only when no 
further load work is pending, i.e.:

{code:java}
boolean hasPendingIncrementalWork =
    builder.hasMoreWork() || work.hasBootstrapLoadTasks();
if (conf.getBoolVar(HiveConf.ConfVars.HIVE_REPL_CLEAR_DANGLING_TXNS_ON_TARGET)
    && !hasPendingIncrementalWork) {
{code}
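The effect of the proposed guard can be sketched in isolation. The class and method names below are hypothetical and only mirror the shape of the condition; the booleans stand in for {{builder.hasMoreWork()}}, {{work.hasBootstrapLoadTasks()}} and the config flag, and the model assumes that "more work" simply means another batch remains.

```java
/** Toy model of the proposed guard; the names here are illustrative, not Hive APIs. */
final class ClearDanglingTxnGuardSketch {

  /** Mirrors the proposed condition: schedule cleanup only when enabled and nothing is pending. */
  static boolean shouldScheduleCleanup(boolean cleanupEnabled,
                                       boolean hasMoreWork,
                                       boolean hasBootstrapLoadTasks) {
    boolean hasPendingIncrementalWork = hasMoreWork || hasBootstrapLoadTasks;
    return cleanupEnabled && !hasPendingIncrementalWork;
  }

  /** Counts cleanups scheduled over a load cycle that is split into batchCount batches. */
  static int cleanupsPerCycle(int batchCount, boolean cleanupEnabled) {
    int cleanups = 0;
    for (int batch = 1; batch <= batchCount; batch++) {
      // Assumption: more work is pending exactly when another batch remains.
      boolean hasMoreWork = batch < batchCount;
      if (shouldScheduleCleanup(cleanupEnabled, hasMoreWork, false)) {
        cleanups++;
      }
    }
    return cleanups;
  }

  public static void main(String[] args) {
    System.out.println(cleanupsPerCycle(3, true));   // prints 1: only the last batch schedules cleanup
    System.out.println(cleanupsPerCycle(1, true));   // prints 1: unbatched load is unchanged
    System.out.println(cleanupsPerCycle(3, false));  // prints 0: feature disabled
  }
}
```

Under these assumptions the cleanup is scheduled exactly once per load cycle regardless of how many batches the cycle is split into, and the single-batch case behaves as before.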



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
