Yun Tang created FLINK-26550:
--------------------------------

             Summary: Correct the information of checkpoint failure 
                 Key: FLINK-26550
                 URL: https://issues.apache.org/jira/browse/FLINK-26550
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
            Reporter: Yun Tang
            Assignee: Yun Tang
             Fix For: 1.15.0, 1.14.5


After FLINK-26049, every failed checkpoint is logged with the message
{{Failed to trigger checkpoint}}:


{code:java}
5812 [pool-5-thread-1] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 1 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1646825286424 for job d2fd07b3b33af453a4e115f3197f81bb.
5913 [pool-5-thread-1] INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Checkpoint 1 of job d2fd07b3b33af453a4e115f3197f81bb expired before completing.
451518 [pool-5-thread-1] WARN  org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to trigger checkpoint 1 for job d2fd07b3b33af453a4e115f3197f81bb. (0 consecutive failed attempts so far)
org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint expired before completing.
        at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:2172) [classes/:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_292]
        at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) [?:1.8.0_292]
        at java.util.concurrent.FutureTask.run(FutureTask.java) [?:1.8.0_292]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_292]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_292]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
{code}

This message is misleading: the log shows that checkpoint 1 was triggered
successfully and later expired before completing, so the failure did not
happen during the trigger phase.
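
To illustrate the kind of correction meant here, the following is a minimal sketch of a log message that depends on the phase in which the checkpoint failed. It is not the actual Flink change: the class {{CheckpointFailureMessages}}, the enum {{Phase}} and the method {{describe}} are hypothetical names introduced only for this example, and only the message layout is taken from the log above.

```java
public class CheckpointFailureMessages {

    /** Hypothetical phases for illustration; these are not Flink's own types. */
    enum Phase { TRIGGER, AFTER_TRIGGER }

    /**
     * Builds a failure message that names the phase in which the checkpoint failed,
     * instead of always claiming that the trigger failed.
     */
    static String describe(Phase phase, long checkpointId, String jobId, int consecutiveFailures) {
        // A checkpoint that fails after being triggered failed to complete, not to trigger.
        String action = (phase == Phase.TRIGGER) ? "trigger" : "complete";
        return String.format(
                "Failed to %s checkpoint %d for job %s. (%d consecutive failed attempts so far)",
                action, checkpointId, jobId, consecutiveFailures);
    }

    public static void main(String[] args) {
        // The expired checkpoint from the log above failed after the trigger phase.
        System.out.println(
                describe(Phase.AFTER_TRIGGER, 1L, "d2fd07b3b33af453a4e115f3197f81bb", 0));
    }
}
```

With such a distinction, the expired checkpoint from the log would be reported as a failure to complete rather than a failure to trigger.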



--
This message was sent by Atlassian Jira
(v8.20.1#820001)