Copilot commented on code in PR #17833:
URL: https://github.com/apache/pinot/pull/17833#discussion_r2896977065


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/ZkBasedTableRebalanceObserver.java:
##########
@@ -226,13 +224,25 @@ public void onSuccess(String msg) {
   @Override
   public void onError(String errorMsg) {
     _controllerMetrics.setValueOfTableGauge(_tableNameWithType, 
ControllerGauge.TABLE_REBALANCE_IN_PROGRESS, 0);
-    long timeToFinishInSeconds = (System.currentTimeMillis() - 
_tableRebalanceProgressStats.getStartTimeMs()) / 1000L;
-    
_tableRebalanceProgressStats.setTimeToFinishInSeconds(timeToFinishInSeconds);
+    
_tableRebalanceProgressStats.setTimeToFinishInSeconds(computeElapsedTimeInSeconds());
     _tableRebalanceProgressStats.setStatus(RebalanceResult.Status.FAILED);
     _tableRebalanceProgressStats.setCompletionStatusMsg(errorMsg);
     trackStatsInZk();
   }
 
+  /**
+   * Safely computes elapsed time in seconds since rebalance started.
+   * Returns 0 if startTimeMs was never set (i.e. still at default 0), which 
happens
+   * when the rebalance completes as NO_OP or fails before START_TRIGGER fires.
+   */
+  private long computeElapsedTimeInSeconds() {
+    long startTimeMs = _tableRebalanceProgressStats.getStartTimeMs();
+    if (startTimeMs <= 0) {
+      return 0L;
+    }
+    return (System.currentTimeMillis() - startTimeMs) / 1000L;
+  }

Review Comment:
   computeElapsedTimeInSeconds() introduces new behavior for 
NO_OP/early-failure cases (startTimeMs<=0 => 0 seconds). Please add/extend unit 
tests (e.g., in TestZkBasedTableRebalanceObserver) to cover onNoop/onError when 
startTimeMs was never initialized, so we don’t regress back to the “Unix 
timestamp” behavior.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/services/PinotTableReloadStatusReporter.java:
##########
@@ -80,6 +81,26 @@ private static double 
computeEstimatedRemainingTimeInMinutes(PinotTableReloadSta
     return estimatedRemainingTimeInMinutes;
   }
 
+  /**
+   * Derives the overall reload job status from aggregated counts.
+   * - COMPLETED: all segments reloaded successfully, no server call failures
+   * - COMPLETED_WITH_ERRORS: reload finished but some segments failed
+   * - IN_PROGRESS: reload is still running
+   */
+  private static String deriveReloadStatus(PinotTableReloadStatusResponse 
response) {
+    int processed = response.getSuccessCount() + (response.getFailureCount() 
!= null
+        ? response.getFailureCount().intValue() : 0);
+    boolean allProcessed = processed >= response.getTotalSegmentCount();
+
+    if (allProcessed && response.getTotalServerCallsFailed() == 0) {
+      if (response.getFailureCount() != null && response.getFailureCount() > 
0) {
+        return "COMPLETED_WITH_ERRORS";
+      }
+      return "COMPLETED";
+    }
+    return "IN_PROGRESS";
+  }

Review Comment:
   New reload status derivation and the remaining-time clamping introduce new 
behavior (e.g., COMPLETED vs COMPLETED_WITH_ERRORS vs IN_PROGRESS, and 
successCount > totalSegmentCount handling) but there are no unit tests covering 
these branches. Please add tests in PinotTableReloadStatusReporterTest to lock 
in the expected status/ETA for: (1) all success, (2) failures present, (3) 
server-call failures, and (4) successCount > totalSegmentCount.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/services/PinotTableReloadStatusReporter.java:
##########
@@ -80,6 +81,26 @@ private static double 
computeEstimatedRemainingTimeInMinutes(PinotTableReloadSta
     return estimatedRemainingTimeInMinutes;
   }
 
+  /**
+   * Derives the overall reload job status from aggregated counts.
+   * - COMPLETED: all segments reloaded successfully, no server call failures
+   * - COMPLETED_WITH_ERRORS: reload finished but some segments failed
+   * - IN_PROGRESS: reload is still running
+   */
+  private static String deriveReloadStatus(PinotTableReloadStatusResponse 
response) {
+    int processed = response.getSuccessCount() + (response.getFailureCount() 
!= null
+        ? response.getFailureCount().intValue() : 0);
+    boolean allProcessed = processed >= response.getTotalSegmentCount();
+
+    if (allProcessed && response.getTotalServerCallsFailed() == 0) {
+      if (response.getFailureCount() != null && response.getFailureCount() > 
0) {
+        return "COMPLETED_WITH_ERRORS";
+      }
+      return "COMPLETED";
+    }
+    return "IN_PROGRESS";

Review Comment:
   deriveReloadStatus() treats any non-zero totalServerCallsFailed as 
"IN_PROGRESS" even when all segments appear processed. If a server call fails 
(transiently or permanently), the reload job may be finished but this status 
would never become COMPLETED/COMPLETED_WITH_ERRORS, undermining the purpose of 
adding a status field. Consider returning COMPLETED_WITH_ERRORS when all 
segments are processed but there were server-call failures and/or segment 
failures (e.g., failureCount>0 or segmentReloadFailures non-empty).
   ```suggestion
      * - COMPLETED: all segments reloaded successfully, no segment or server 
call failures
      * - COMPLETED_WITH_ERRORS: reload finished but some segments and/or 
server calls failed
      * - IN_PROGRESS: reload is still running (not all segments processed yet)
      */
     private static String deriveReloadStatus(PinotTableReloadStatusResponse 
response) {
       int failureCount = response.getFailureCount() != null ? 
response.getFailureCount() : 0;
       int processed = response.getSuccessCount() + failureCount;
       boolean allProcessed = processed >= response.getTotalSegmentCount();
   
       // If not all segments have been accounted for, the reload is still in 
progress.
       if (!allProcessed) {
         return "IN_PROGRESS";
       }
   
       // Reload has finished (all segments processed). Determine if it 
completed cleanly or with errors.
       boolean hasSegmentFailures = failureCount > 0;
       boolean hasServerCallFailures = response.getTotalServerCallsFailed() > 0;
       boolean hasSegmentReloadFailures = false;
       if (response.getSegmentReloadFailures() != null && 
!response.getSegmentReloadFailures().isEmpty()) {
         hasSegmentReloadFailures = true;
       }
   
       if (hasSegmentFailures || hasServerCallFailures || 
hasSegmentReloadFailures) {
         return "COMPLETED_WITH_ERRORS";
       }
       return "COMPLETED";
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to