leletan commented on code in PR #20852:
URL: https://github.com/apache/flink/pull/20852#discussion_r988631290


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerNG.java:
##########
@@ -125,7 +127,8 @@ void notifyKvStateUnregistered(
     CompletableFuture<String> triggerSavepoint(
             @Nullable String targetDirectory, boolean cancelJob, 
SavepointFormatType formatType);
 
-    CompletableFuture<String> triggerCheckpoint();
+    CompletableFuture<CompletedCheckpoint> triggerCheckpoint(
+            @Nullable CheckpointType checkpointType);

Review Comment:
   Good question. Correct, this is nullable here but not nullable in the 
request body. The reason is a little bit complicated to explain:
   At essence, the null is only to take care of 
[this](https://github.com/apache/flink/pull/17278) special case of legacy
   `CompletableFuture<String> triggerCheckpoint(@RpcTimeout final Time 
timeout)` function.
   
   Here are some more context:
   
   The call sequence of the above mentioned legacy function is 
   **_Minicluster -> DispatcherGateway -> Dispatcher -> JobMasterGateway -> 
..._** 
   and it expect a return type of `CompletableFuture<String>` (String for 
checkpointLocation). This one does not have any CheckpointType in the function 
parameter.
   
   The call sequence of the new API is 
   **_CheckpointTriggerHandler -> RestfulGateway -> Dispatcher -> 
JobMasterGateway -> ..._** 
   and it expect a return type of `CompletableFuture<Long>` (Long for 
checkpointId). In this call sequence we expect `checkpointType` to be `nonNull`
   
   To avoid keeping 2 very similar functions down the same call sequence 
(**_...-> JobMasterGateway -> JobMaster -> SchedulerNG -> Scheduler -> ..._**) 
and duplicated a lot of code, we combine them through
   ```
       default CompletableFuture<String> triggerCheckpoint(@RpcTimeout final 
Time timeout) {
           return triggerCheckpoint(null, 
timeout).thenApply(CompletedCheckpoint::getExternalPointer);
       }
   ``` 
   in JobMasterGateway.
   
   We may also combine them in `Dispatcher`  - the first class where 2 call 
sequence meet, however the legacy call sequence already had a non HA safe 
implementation (more context 
[here](https://issues.apache.org/jira/browse/FLINK-18312)) which is good enough 
for Minicluster but not good enough for RestfulAPI. 
   
   Please let me know if you have any questions about above.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to