Myasuka commented on code in PR #21281:
URL: https://github.com/apache/flink/pull/21281#discussion_r1018741944


##########
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointFailureManager.java:
##########
@@ -204,7 +204,8 @@ private void checkFailureAgainstCounter(
             if (continuousFailureCounter.get() > tolerableCpFailureNumber) {
                 clearCount();
                 errorHandler.accept(
-                        new FlinkRuntimeException(EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE));
+                        new FlinkRuntimeException(
+                                EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE, exception));

Review Comment:
   The job fails because the continuous failure counter exceeds the tolerable
number, but at this point we only have the exception from the last failed
checkpoint. Attaching it as the cause could make users think that all of the
counted checkpoints failed for that same reason. The correct way is to let users
check the JobManager logs or the checkpoint UI to see what happened in the
recent checkpoints.
   From my point of view, I am +0 on this proposal.
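
To make the concern concrete, here is a minimal standalone sketch of the pattern in the diff. It does not use Flink classes: `FailureCounterSketch`, `onCheckpointFailure`, and the message text are hypothetical stand-ins, and a plain `RuntimeException` replaces `FlinkRuntimeException`. It only illustrates that the reported cause is whichever exception pushed the counter over the threshold, even if earlier failures had different reasons.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in for the counter logic shown in the diff; not Flink code.
class FailureCounterSketch {
    // Placeholder text, not the actual Flink constant value.
    static final String EXCEEDED_MESSAGE = "Exceeded tolerable checkpoint failures (placeholder)";

    private final int tolerableFailures;
    private final Consumer<RuntimeException> errorHandler;
    private final AtomicInteger continuousFailures = new AtomicInteger();

    FailureCounterSketch(int tolerableFailures, Consumer<RuntimeException> errorHandler) {
        this.tolerableFailures = tolerableFailures;
        this.errorHandler = errorHandler;
    }

    void onCheckpointFailure(Exception exception) {
        if (continuousFailures.incrementAndGet() > tolerableFailures) {
            continuousFailures.set(0);
            // Only the exception of the failure that crossed the threshold is
            // attached; the reasons for the earlier failures are not carried along.
            errorHandler.accept(new RuntimeException(EXCEEDED_MESSAGE, exception));
        }
    }

    public static void main(String[] args) {
        FailureCounterSketch counter =
                new FailureCounterSketch(
                        1, e -> System.out.println("cause: " + e.getCause().getMessage()));
        counter.onCheckpointFailure(new IOException("disk full"));          // tolerated
        counter.onCheckpointFailure(new IllegalStateException("timeout"));  // crosses the threshold
    }
}
```

In this sketch the first failure ("disk full") never reaches the handler, which is the misleading impression described above.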


