klion26 commented on code in PR #3640:
URL: https://github.com/apache/amoro/pull/3640#discussion_r2212839322


##########
amoro-ams/src/main/java/org/apache/amoro/server/DefaultOptimizingService.java:
##########
@@ -512,10 +515,19 @@ public void run() {
     }
 
     private void retryTask(TaskRuntime<?> task, OptimizingQueue queue) {
-      LOG.info(
-          "Task {} is suspending, since it's optimizer is expired, put it to 
retry queue, optimizer {}",
-          task.getTaskId(),
-          task.getResourceDesc());
+      if (task.getStatus() == TaskRuntime.Status.ACKED
+          && task.getStartTime() + taskExecuteTimeout < 
System.currentTimeMillis()) {
+        LOG.warn(
+            "Task {} is suspending at ACK status (start time: {}), put it to 
retry queue, optimizer {}",

Review Comment:
   could we enhance the log more, something like `(start time: {}), maybe the 
execution has been finished, but amoro did not received the complete`



##########
amoro-ams/src/main/java/org/apache/amoro/server/AmoroManagementConf.java:
##########
@@ -380,6 +380,12 @@ public class AmoroManagementConf {
           .defaultValue(Duration.ofSeconds(30))
           .withDescription("Timeout duration for task acknowledgment.");
 
+  public static final ConfigOption<Duration> OPTIMIZER_TASK_EXECUTE_TIMEOUT =
+      ConfigOptions.key("optimizer.task-execute-timeout")
+          .durationType()
+          .defaultValue(Duration.ofHours(5))
+          .withDescription("Timeout duration for task execution, default to 5 
hours.");

Review Comment:
   IMHO, the default value for this can be changed to `1`/`2` hours. If a 
single task is executed more than `1`/`2` hours, it is probably completed. As 
this can cover most cases, we may change the default value to `1`/`2` hours



##########
dist/src/main/amoro-bin/conf/config.yaml:
##########
@@ -55,6 +55,7 @@ ams:
   optimizer:
     heart-beat-timeout: 1min # 60000
     task-ack-timeout: 30s # 30000
+    task-execute-timeout: 5h # 18000000

Review Comment:
   ditto



##########
amoro-ams/src/test/java/org/apache/amoro/server/TestDefaultOptimizingService.java:
##########
@@ -311,6 +311,36 @@ public void testAckAndCompleteTask() {
     assertTaskCompleted(taskRuntime);
   }
 
+  @Test
+  public void testExecuteTaskTimeOutAndRetry() throws InterruptedException {

Review Comment:
   Thanks for adding the test to cover the logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to