abstractdog commented on code in PR #4899:
URL: https://github.com/apache/hive/pull/4899#discussion_r2324732623


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java:
##########
@@ -293,6 +293,13 @@ protected void initializeAndRunProcessor(Map<String, LogicalInput> inputs,
       rproc.run();
 
       perfLogger.perfLogEnd(CLASS_NAME, PerfLogger.TEZ_RUN_PROCESSOR);
+
+      // Try to call canCommit to AM. If there is no other speculative attempt execute canCommit, then continue.
+      // If there are other speculative attempt execute canCommit first, then wait until the attempt is killed
+      // or the committed task fails.
+      while (!this.processorContext.canCommit()) {

Review Comment:
   Thanks @zhengchenyu for the explanation, it makes sense. I agree that it's a 
rare edge case for another task to start committing and then fail, but if it 
happens, the current task should react in time, and a maximum of 500 ms looks 
good for that purpose. Exponential backoff would probably be overcomplicated 
here: if canCommit == false, we can expect this task to be killed anyway, and 
polling every 500 ms won't overload the AM. So I'm fine with Thread.sleep(500), 
with an extra comment above the sleep call that contains your explanation.
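
   For illustration only, here is a minimal sketch of the fixed-interval wait 
described above. It is not the code from the PR: the class name 
`CanCommitPoller`, the method `waitUntilCanCommit`, and the constant 
`POLL_INTERVAL_MS` are hypothetical, and a plain `BooleanSupplier` stands in 
for the `processorContext.canCommit()` call shown in the diff.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hive or Tez.
final class CanCommitPoller {
    // Fixed interval discussed in the review: short enough to react quickly
    // if the committing attempt fails, long enough not to flood the AM.
    static final long POLL_INTERVAL_MS = 500L;

    /**
     * Polls canCommit until it returns true, sleeping a fixed interval
     * between attempts. No exponential backoff: an attempt that is refused
     * is expected to be killed soon anyway.
     *
     * @return the number of times canCommit was called
     */
    static int waitUntilCanCommit(BooleanSupplier canCommit, long pollIntervalMs)
            throws InterruptedException {
        int polls = 1;
        while (!canCommit.getAsBoolean()) {
            // Another attempt was granted the commit first. Wait until this
            // attempt is killed (interrupting the sleep) or the other
            // attempt fails and the commit is granted here.
            Thread.sleep(pollIntervalMs);
            polls++;
        }
        return polls;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the AM: refuses twice, then grants the commit.
        AtomicInteger calls = new AtomicInteger();
        int polls = waitUntilCanCommit(() -> calls.incrementAndGet() >= 3, 10L);
        System.out.println("commit granted after " + polls + " polls");
    }
}
```

   In real use the interval would be `POLL_INTERVAL_MS`; the demo in `main` 
passes 10 ms only so that it finishes quickly.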



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

