JesseAtSZ commented on PR #20091:
URL: https://github.com/apache/flink/pull/20091#issuecomment-1173787366

   > Thanks for the PR @JesseAtSZ ,
   > 
   > I'm trying to understand why checkpoints are failing with `FINALIZE_CHECKPOINT_FAILURE` (which is ignored by `CheckpointFailureManager`) and not something like `IOException`. From the code, it might happen only in `CheckpointCoordinator` - when all the tasks have already acknowledged the checkpoint. That probably means that the job is stateless. Could you confirm that @JesseAtSZ ?
   > 
   > If that's NOT the case then we probably should fix failure counting first.
   > 
   > Another question is related to the TM - do we need a symmetric check there? (in `FsCheckpointStorageAccess.resolveCheckpointStorageLocation`).
   
   I read the code: `FINALIZE_CHECKPOINT_FAILURE` only occurs in the checkpoint coordinator. I also don't think it is necessary to check the path on the TM, because the initialization in the coordinator happens before `performCheckpoint`.
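
To make the failure-counting concern from the quoted question concrete, here is a minimal sketch of a counter that skips an ignored failure reason. It is not Flink's `CheckpointFailureManager`: the class `FailureCounterSketch`, its methods, and the `Reason` enum (including `IO_EXCEPTION` and `CHECKPOINT_DECLINED`) are hypothetical names chosen for illustration. It only shows why a failure that is always reported with an ignored reason would never trip the tolerable-failure threshold.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not Flink code: counts checkpoint failures but
// skips reasons that are configured as ignored.
class FailureCounterSketch {

    // Illustrative reasons; only FINALIZE_CHECKPOINT_FAILURE is named in the thread.
    enum Reason { IO_EXCEPTION, CHECKPOINT_DECLINED, FINALIZE_CHECKPOINT_FAILURE }

    // Assumption: the finalize failure is the only ignored reason in this sketch.
    private static final Set<Reason> IGNORED =
            EnumSet.of(Reason.FINALIZE_CHECKPOINT_FAILURE);

    private final int tolerableFailures;
    private int consecutiveFailures;

    FailureCounterSketch(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    // Ignored reasons leave the counter untouched.
    void onFailure(Reason reason) {
        if (!IGNORED.contains(reason)) {
            consecutiveFailures++;
        }
    }

    // A successful checkpoint resets the counter.
    void onSuccess() {
        consecutiveFailures = 0;
    }

    // The job should fail once counted failures exceed the threshold.
    boolean shouldFailJob() {
        return consecutiveFailures > tolerableFailures;
    }
}
```

With a threshold of 0, any number of `FINALIZE_CHECKPOINT_FAILURE` reports leaves `shouldFailJob()` false in this sketch, while a single `IO_EXCEPTION` makes it true.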
   

