akalash commented on pull request #16101:
URL: https://github.com/apache/flink/pull/16101#issuecomment-856759212


   > IIUC, the changes only clarify the timings but don't add any new 
information (checkpointDuration was logged before; finalizationTime can be 
inferred from log message timestamps).
   
   It is true, but it is hidden knowledge. As you can see in the ticket (and I 
agree with that), everybody expected the difference between 'Triggering 
checkpoint' and 'Completed checkpoint' to be equal to the checkpoint duration, 
which is not true. My changes just make this explicit in order to remove the 
misunderstanding.
   
   > WDYT about logging the duration of 
CheckpointCoordinator.dropSubsumedCheckpoints and 
CheckpointSubsumeHelper.subsume?
   
   It is not merely a suspect; it is definitely the reason for the delay (more 
precisely, org.apache.flink.runtime.checkpoint.CompletedCheckpoint#discard -> 
FileStateHandle#discardState). But I don't think that logging the time for 
subsume alone would help, because subsume is complex in itself and we would 
need timings for every step inside it, and so on. In general, I have also 
thought about this, and a universal time tracker that can measure the 
different steps of a checkpoint looks like a good idea, but I don't think 
that we want to do it now.
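
   To make the "universal time tracker" idea a bit more concrete, here is a 
minimal sketch of what I have in mind. Everything in it is an assumption for 
illustration only: StepTimer, measure, millis and summary are hypothetical 
names, nothing like this exists in Flink today, and it is not part of this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a Flink class): records how long each named step took. */
final class StepTimer {
    // Clock is injectable so the timer can be tested deterministically.
    private final LongSupplier nanoClock;
    // LinkedHashMap keeps the steps in the order they were first measured.
    private final Map<String, Long> durationsNanos = new LinkedHashMap<>();

    StepTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    StepTimer() {
        this(System::nanoTime);
    }

    /** Runs the action and adds its elapsed time to the given step, even if it throws. */
    void measure(String step, Runnable action) {
        long start = nanoClock.getAsLong();
        try {
            action.run();
        } finally {
            durationsNanos.merge(step, nanoClock.getAsLong() - start, Long::sum);
        }
    }

    /** Accumulated time of a step in milliseconds, or 0 if it was never measured. */
    long millis(String step) {
        return durationsNanos.getOrDefault(step, 0L) / 1_000_000;
    }

    /** One-line summary suitable for a single log statement. */
    String summary() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : durationsNanos.entrySet()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue() / 1_000_000).append(" ms");
        }
        return sb.toString();
    }
}
```

   The steps of completing a checkpoint (for example subsume and discard) 
could each be wrapped in measure(...), and summary() could be logged once 
next to the 'Completed checkpoint' message. Repeated measurements of the same 
step are summed, so a step executed in a loop still produces a single entry.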


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

