Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-13 Thread via GitHub


masteryhx commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1426080866


##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -188,6 +207,21 @@ CompletableFuture triggerSavepoint(
  */
 CompletableFuture triggerCheckpoint(JobID jobId, CheckpointType 
checkpointType);
 
+/**
+ * Triggers a detached savepoint for the job identified by the job id. The 
savepoint will be
+ * written to the given savepoint directory, or {@link
+ * 
org.apache.flink.configuration.CheckpointingOptions#SAVEPOINT_DIRECTORY} if it 
is null.
+ * Notice that: detach savepoint will return with a savepoint trigger id 
instead of the path
+ * future, that means the client will return very quickly.
+ *
+ * @param jobId job id
+ * @param savepointDirectory directory the savepoint should be written to
+ * @param formatType a binary format of the savepoint
+ * @return The savepoint trigger id
+ */
+CompletableFuture triggerDetachSavepoint(

Review Comment:
   Sorry for misleading.
   I just talked about naming.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-13 Thread via GitHub


masteryhx commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1854996169

   > @masteryhx @Zakelly Following your constructive comments, I have 
rearranged the commits and rebased to the master branch, please have a look, 
many thanks~
   
   Thanks for the update.
   I saw some minor comments haven't been resolved. 
   Do you have any other concern ? We could discuss in every comment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-13 Thread via GitHub


masteryhx commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1426092346


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -164,6 +164,14 @@ public class CliFrontendParser {
 false,
 "Defines whether to trigger this checkpoint as a full 
one.");
 
+static final Option SAVEPOINT_DETACH_OPTION =

Review Comment:
   1. How about just using 'detached' here aligned with DETACHED_OPTION ? It 
should not be conflict with other one because it will only be checked in 
savepoint and stop-with-savepoint command, right ? It will help us migrating in 
the future.
   2. I saw this could also be used for manual checkpoint. Of course, we could 
support this in other ticket/pr.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-14 Thread via GitHub


xiangforever2014 commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1427483790


##
docs/content.zh/docs/deployment/cli.md:
##
@@ -125,6 +125,43 @@ Lastly, you can optionally provide what should be the 
[binary format]({{< ref "d
 
 The path to the savepoint can be used later on to [restart the Flink 
job](#starting-a-job-from-a-savepoint).
 
+If the state of the job is quite big, the client will get a timeout exception 
since it should wait for the savepoint finished.
+```
+Triggering savepoint for job bec5244e09634ad71a80785937a9732d.
+Waiting for response...
+
+--
+The program finished with the following exception:
+
+org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
bec5244e09634ad71a80785937a9732d failed.
+at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend. java:828)
+at 
org.apache.flink.client.cli.CliFrontend.lambda$savepopint$8(CliFrontend.java:794)
+at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1078)
+at 
org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:779)
+at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1150)
+at 
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1226)
+at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
+at 
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1226)
+at org.apache.flink.client.cli.CliFrontend.main(CliFronhtend.java:1194)
+Caused by: java.util.concurrent.TimeoutException
+at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
+at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
+at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:822)
+... 8 more
+```
+In this case, we could use "-dsp" option to trigger a detached savepoint, the 
client will return immediately as soon as the trigger id returns.
+```bash
+$ ./bin/flink savepoint \
+  $JOB_ID \ 
+  /tmp/flink-savepoints
+  -dsp
+```
+```
+Triggering savepoint in detached mode for job bec5244e09634ad71a80785937a9732d.
+Successfully trigger manual savepoint, triggerId: 
2505bbd12c5b58fd997d0f193db44b97
+```
+We can get the status of the detached savepoint by rest api 
"/jobs/:jobid/savepoints/:triggerid".

Review Comment:
   done~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-14 Thread via GitHub


xiangforever2014 commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1427486664


##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -188,6 +207,21 @@ CompletableFuture triggerSavepoint(
  */
 CompletableFuture triggerCheckpoint(JobID jobId, CheckpointType 
checkpointType);
 
+/**
+ * Triggers a detached savepoint for the job identified by the job id. The 
savepoint will be
+ * written to the given savepoint directory, or {@link
+ * 
org.apache.flink.configuration.CheckpointingOptions#SAVEPOINT_DIRECTORY} if it 
is null.
+ * Notice that: detach savepoint will return with a savepoint trigger id 
instead of the path
+ * future, that means the client will return very quickly.
+ *
+ * @param jobId job id
+ * @param savepointDirectory directory the savepoint should be written to
+ * @param formatType a binary format of the savepoint
+ * @return The savepoint trigger id
+ */
+CompletableFuture triggerDetachSavepoint(

Review Comment:
   Thanks for your careful comments, I have rename it to "detached" and checked 
all other places~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-14 Thread via GitHub


xiangforever2014 commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1427489342


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -164,6 +164,14 @@ public class CliFrontendParser {
 false,
 "Defines whether to trigger this checkpoint as a full 
one.");
 
+static final Option SAVEPOINT_DETACH_OPTION =

Review Comment:
   1. Good advice, I think using 'detached' option here can make it easily for 
users to understand, thanks for comment.
   2. Yes, I have notice this point too, maybe we could finish this ticket 
first ~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-16 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1858801528

   @masteryhx  Hello, the CI is succeed, thank you~
   https://github.com/apache/flink/assets/23545296/76f908e7-5652-4a48-825a-96e42755970b";>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-18 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1862077615

   @masteryhx kindly remind~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-26 Thread via GitHub


masteryhx closed pull request #23253: [FLINK-32881][checkpoint] Support 
triggering savepoint in detach mode for CLI and dumping all pending savepoint 
ids by rest api 
URL: https://github.com/apache/flink/pull/23253


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-20 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1820158718

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-20 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1820273335

   @masteryhx Many thanks for your suggestion, and I have followed your advice 
to simplify the code.
   I have tested in my local machine, when triggering a detached savepoint, the 
CLI will return with a trigger id immediately, and I could use the trigger id 
to get the status of the savepoint by REST API 
[/jobs/:jobid/savepoints/:triggerid 
](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/ops/rest_api/#jobs-jobid-savepoints-triggerid).
   
   Triggering a savepoint vs. Triggering a detached savepoint:
   1. Triggering a savepoint: from the picture, we could see that the CLI 
should wait for the savepoint to be finished, and it is very easy to cause a 
CLI timeout when the state of the job is huge.
   
![image](https://github.com/apache/flink/assets/23545296/ee4e5cf2-cb19-40c3-bcc2-3c558c952593)
   
   2. Triggering a detached savepoint: the CLI returned with the trigger id 
immediately, and we could get the status of the savepoint by REST API.
   
![img_v3_025d_dad79492-1bae-4843-bb1e-42370cf2b52g](https://github.com/apache/flink/assets/23545296/d0b20561-bf0e-42cc-97a5-bfb3c387c29c)
   
   - when the savepoint has not finished, we get
   
![img_v3_025d_982195b1-b21c-418f-8752-73d13ff2564g](https://github.com/apache/flink/assets/23545296/daa1874f-2902-47f8-8efe-3c8923dbb225)
   
   - once the savepoint finished, we get
   
![img_v3_025d_e064e7aa-147b-4a31-b3eb-9bc3accaa85g](https://github.com/apache/flink/assets/23545296/011c8e52-bd37-4215-b09f-31dcc013f7b0)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-26 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1827056077

   @masteryhx hope to get your reply, many thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-26 Thread via GitHub


masteryhx commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1403926743


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =

Review Comment:
   Whatever we define a new option or just reuse existing one, we should update 
all related docs.



##
flink-clients/src/main/java/org/apache/flink/client/program/rest/RestClusterClient.java:
##
@@ -608,6 +616,27 @@ private CompletableFuture triggerSavepoint(
 });
 }
 
+private CompletableFuture triggerDetachSavepoint(
+final JobID jobId,
+final @Nullable String savepointDirectory,
+final boolean cancelJob,
+final SavepointFormatType formatType) {
+final SavepointTriggerHeaders savepointTriggerHeaders =

Review Comment:
   The previous logic is similar with savepoint, right ?
   Could we unify them into a method ?



##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =
+new Option(
+"dcp",

Review Comment:
   BTW, I also thinked this again.
   Could `stop-with-savepoint` also has this problem ?
   If ture, I think we could resolve this together (you could add a new commit).
   Then maybe reusing "detached" command looks good to me.



##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =
+new Option(
+"dcp",

Review Comment:
   How about "dsp" ?
   Or Could we just reuse existing "detached" ?



##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -178,6 +178,23 @@ CompletableFuture stopWithSavepoint(
 CompletableFuture triggerSavepoint(
 JobID jobId, @Nullable String savepointDirectory, 
SavepointFormatType formatType);
 
+/**
+ * Triggers a detach savepoint for the job identified by the job id. The 
savepoint will be
+ * written to the given savepoint directory, or {@link
+ * 
org.apache.flink.configuration.CheckpointingOptions#SAVEPOINT_DIRECTORY} if it 
is null.
+ * Notice that: detach savepoint will return with a savepoint trigger id 
instead of the path
+ * future, that means the client will return very quickly.
+ *
+ * @param jobId job id
+ * @param savepointDirectory directory the savepoint should be written to
+ * @param formatType a binary format of the savepoint
+ * @return The savepoint trigger id
+ */
+default CompletableFuture triggerDetachSavepoint(
+JobID jobId, @Nullable String savepointDirectory, 
SavepointFormatType formatType) {
+return triggerSavepoint(jobId, savepointDirectory, formatType);

Review Comment:
   Do we really need the default method ?
   It's not a public interface, and seems the default semantic doesn't match 
`detach`



##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -178,6 +178,23 @@ CompletableFuture stopWithSavepoint(
 CompletableFuture triggerSavepoint(
 JobID jobId, @Nullable String savepointDirectory, 
SavepointFormatType formatType);
 
+/**
+ * Triggers a detach savepoint for the job identified by the job id. The 
savepoint will be

Review Comment:
   ```suggestion
* Triggers a detached savepoint for the job identified by the job id. 
The savepoint will be
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-27 Thread via GitHub


xiangforever2014 commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1405754517


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =

Review Comment:
   Thanks for your comment, I have updated the related docs manually.



##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =
+new Option(
+"dcp",

Review Comment:
   These is a "-d" option for savepoints already(SAVEPOINT_DISPOSE_OPTION), so 
I think we can not just simply reuse the existing detached 
option(DETACHED_OPTION) since it's short option is also "-d", if we force to 
reuse the option, we need to modify either SAVEPOINT_DISPOSE_OPTION or 
DETACHED_OPTION,  which may cause some influence to existing users, so I think 
it's better to add a new option to support detached savepoint, WDYT?



##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -178,6 +178,23 @@ CompletableFuture stopWithSavepoint(
 CompletableFuture triggerSavepoint(
 JobID jobId, @Nullable String savepointDirectory, 
SavepointFormatType formatType);
 
+/**
+ * Triggers a detach savepoint for the job identified by the job id. The 
savepoint will be
+ * written to the given savepoint directory, or {@link
+ * 
org.apache.flink.configuration.CheckpointingOptions#SAVEPOINT_DIRECTORY} if it 
is null.
+ * Notice that: detach savepoint will return with a savepoint trigger id 
instead of the path
+ * future, that means the client will return very quickly.
+ *
+ * @param jobId job id
+ * @param savepointDirectory directory the savepoint should be written to
+ * @param formatType a binary format of the savepoint
+ * @return The savepoint trigger id
+ */
+default CompletableFuture triggerDetachSavepoint(
+JobID jobId, @Nullable String savepointDirectory, 
SavepointFormatType formatType) {
+return triggerSavepoint(jobId, savepointDirectory, formatType);

Review Comment:
   Thanks for your insightful comments, the reason we add the default method 
here is that we only want to implement the detached mode for rest api in this 
commit. I have resolved this comment, PTAL.



##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -178,6 +178,23 @@ CompletableFuture stopWithSavepoint(
 CompletableFuture triggerSavepoint(
 JobID jobId, @Nullable String savepointDirectory, 
SavepointFormatType formatType);
 
+/**
+ * Triggers a detach savepoint for the job identified by the job id. The 
savepoint will be

Review Comment:
   Thanks for your observant comments, resolved~



##
flink-clients/src/main/java/org/apache/flink/client/program/rest/RestClusterClient.java:
##
@@ -608,6 +616,27 @@ private CompletableFuture triggerSavepoint(
 });
 }
 
+private CompletableFuture triggerDetachSavepoint(
+final JobID jobId,
+final @Nullable String savepointDirectory,
+final boolean cancelJob,
+final SavepointFormatType formatType) {
+final SavepointTriggerHeaders savepointTriggerHeaders =

Review Comment:
   Thanks for your comment, I have unified the detached/non-detached savepoint 
to one method~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-27 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1827533221

   @masteryhx Many thanks for your insightful comments, I have updated the code 
according to the comments, PTAL, thanks again bro~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-29 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1831543321

   @masteryhx kindly remind, hope to get your reply, many thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-29 Thread via GitHub


masteryhx commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1832964678

   > @masteryhx kindly remind, hope to get your reply, many thanks~
   
   Seems there are some conflicts.
   Could you rebase the master and resolve them at first ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-11-29 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1833181042

   @masteryhx Hello, I have rebased the code to the master, PTAL~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-04 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1839926426

   @masteryhx Kindly remind, hope to get your reply, many thanks ^_^


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


masteryhx commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1415176143


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =
+new Option(
+"dcp",

Review Comment:
   Thanks for the clarification.
   Seems we have to use a new option because of the historical reason.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


masteryhx commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1415217180


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =
+new Option(
+"dcp",

Review Comment:
   It will be clearer If we could make it aligned in Flink 2.0.
   cc @Zakelly



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


Zakelly commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1415245692


##
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java:
##
@@ -157,6 +157,14 @@ public class CliFrontendParser {
 + " for changing state backends, native = a 
specific format for the"
 + " chosen state backend, might be faster to take 
and restore from.");
 
+public static final Option SAVEPOINT_DETACH_OPTION =
+new Option(
+"dcp",

Review Comment:
   Hi @xiangforever2014 @masteryhx sorry for jumping in... And I'm wondering if 
we could use the capital '-D' for short option? Or I suggest we do not leave 
any short option for this, only using '-detach'.
   
   I agree we could adjust these CLI options in 2.0 .



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


Zakelly commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1840388226

   Hi @xiangforever2014 @masteryhx sorry for jumping in... And I'm wondering if 
we could use the capital '-D' for short option? Or I suggest we do not leave 
any short option for this, only using '-detach'. WDYT?
   
   I agree we could adjust these CLI options in 2.0. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


masteryhx commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1415222035


##
docs/content.zh/docs/deployment/cli.md:
##
@@ -125,6 +125,43 @@ Lastly, you can optionally provide what should be the 
[binary format]({{< ref "d
 
 The path to the savepoint can be used later on to [restart the Flink 
job](#starting-a-job-from-a-savepoint).
 
+If the state of the job is quite big, the client will get a timeout exception 
since it should wait for the savepoint finished.

Review Comment:
   ```suggestion
   If the state size of the job is quite big, the client will get a timeout 
exception since it has to wait for the savepoint finished.
   ```



##
docs/content.zh/docs/deployment/cli.md:
##
@@ -125,6 +125,43 @@ Lastly, you can optionally provide what should be the 
[binary format]({{< ref "d
 
 The path to the savepoint can be used later on to [restart the Flink 
job](#starting-a-job-from-a-savepoint).
 
+If the state of the job is quite big, the client will get a timeout exception 
since it should wait for the savepoint finished.
+```
+Triggering savepoint for job bec5244e09634ad71a80785937a9732d.
+Waiting for response...
+
+--
+The program finished with the following exception:
+
+org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
bec5244e09634ad71a80785937a9732d failed.
+at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend. java:828)
+at 
org.apache.flink.client.cli.CliFrontend.lambda$savepopint$8(CliFrontend.java:794)
+at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1078)
+at 
org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:779)
+at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1150)
+at 
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1226)
+at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
+at 
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1226)
+at org.apache.flink.client.cli.CliFrontend.main(CliFronhtend.java:1194)
+Caused by: java.util.concurrent.TimeoutException
+at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
+at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
+at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:822)
+... 8 more
+```
+In this case, we could use "-dsp" option to trigger a detached savepoint, the 
client will return immediately as soon as the trigger id returns.

Review Comment:
   ```suggestion
   In this case, we could use "-dsp" option to trigger a detached savepoint, 
the client will return the trigger id immediately.
   ```



##
docs/content.zh/docs/deployment/cli.md:
##
@@ -125,6 +125,43 @@ Lastly, you can optionally provide what should be the 
[binary format]({{< ref "d
 
 The path to the savepoint can be used later on to [restart the Flink 
job](#starting-a-job-from-a-savepoint).
 
+If the state of the job is quite big, the client will get a timeout exception 
since it should wait for the savepoint finished.
+```
+Triggering savepoint for job bec5244e09634ad71a80785937a9732d.
+Waiting for response...
+
+--
+The program finished with the following exception:
+
+org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
bec5244e09634ad71a80785937a9732d failed.
+at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend. java:828)
+at 
org.apache.flink.client.cli.CliFrontend.lambda$savepopint$8(CliFrontend.java:794)
+at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1078)
+at 
org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:779)
+at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1150)
+at 
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1226)
+at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
+at 
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1226)
+at org.apache.flink.client.cli.CliFrontend.main(CliFronhtend.java:1194)
+Caused by: java.util.concurrent.TimeoutException
+at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
+at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
+at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:822)
+... 8 more
+```
+In this case, we could use "-dsp" option to trigger a detached savepoint, the 
client will return immediately as soon as the trigger id

Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


xiangforever2014 commented on code in PR #23253:
URL: https://github.com/apache/flink/pull/23253#discussion_r1416745308


##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -188,6 +207,21 @@ CompletableFuture triggerSavepoint(
  */
 CompletableFuture triggerCheckpoint(JobID jobId, CheckpointType 
checkpointType);
 
+/**
+ * Triggers a detached savepoint for the job identified by the job id. The 
savepoint will be
+ * written to the given savepoint directory, or {@link
+ * 
org.apache.flink.configuration.CheckpointingOptions#SAVEPOINT_DIRECTORY} if it 
is null.
+ * Notice that: detach savepoint will return with a savepoint trigger id 
instead of the path
+ * future, that means the client will return very quickly.
+ *
+ * @param jobId job id
+ * @param savepointDirectory directory the savepoint should be written to
+ * @param formatType a binary format of the savepoint
+ * @return The savepoint trigger id
+ */
+CompletableFuture triggerDetachSavepoint(

Review Comment:
   Thanks for your comment, sorry but I want to ensure that this means we 
should implement `stopWithDetachedSavepoint` by invoking 
`triggerDetachedSavepoint`? Or we should abstract an interface which is 
responsible for implementing normal detached savepoint and 
stop-with-detached-savepoint? I think currently it is clear that we use two 
different function to trigger savepoint in detached mode, that is 
`triggerDetachSavepoint` & `stopWithDetachedSavepoint`, do you want to merge 
these two into one function? Hope to get your opinion~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-05 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1842188948

   > Hi @xiangforever2014 @masteryhx sorry for jumping in... And I'm wondering 
if we could use the capital '-D' for short option? Or I suggest we do not leave 
any short option for this, only using '-detach'. WDYT?
   > 
   > And I agree we could adjust all these CLI options in 2.0.
   
   Thanks for your comment~ 
   But I wonder that is there any problem in current implementation? Or why we 
need to change it to "-D" or "-detach"?
   Hope to get your suggestion~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-07 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1845053733

   @masteryhx Hello, I have updated the code according to your comment, PTAL~
   Here are the test results:
   1. trigger the normal savepoint, the client gets a timeout exception:
   
![img_v3_025t_ebcc0196-37f9-43d8-99b8-1c5010005f4g](https://github.com/apache/flink/assets/23545296/bc3b6bf2-9c19-43e6-9270-9529d384c6cd)
   2. trigger the detached savepoint, the client returns the trigger id 
immediately, and we could monitor the status of the savepoint by rest api:
   
![img_v3_025t_fd81de0b-563c-456d-a8f2-7804df9f5d1g](https://github.com/apache/flink/assets/23545296/860f5603-25e9-4c97-a714-44d1b7196999)
   
![img_v3_025t_eea0bf11-2a1d-4605-8f8f-f5c894eb488g](https://github.com/apache/flink/assets/23545296/6e96da72-8fad-46ff-b01b-e4c43fe84eca)
   
![img_v3_025t_febc7d37-c4b8-4d8f-b7f4-7f1a7a8ffc9g](https://github.com/apache/flink/assets/23545296/1c7c4017-74f1-4582-abd4-bff4647e9135)
   3. trigger the stop-with-savepoint, the client also gets a timeout 
exception, and the job will be stopped after the savepoint finished:
   
![img_v3_025t_e673b71c-ebae-4041-9c1c-7a1c0a613cfg](https://github.com/apache/flink/assets/23545296/4b03255e-e136-4203-8f57-0a85e3462060)
   
![img_v3_025t_e2868559-70ac-4d71-8e0a-3546eee6a5bg](https://github.com/apache/flink/assets/23545296/6aff8082-b999-4a2b-b7bf-3bdee93837ba)
   4. trigger the stop-with-detached-savepoint, the client also returns the 
trigger id immediately, and we could get the status of the savepoint by rest 
api through the trigger id, after the savepoint finished, the job will be 
stopped too:
   
![img_v3_025t_d3eb2c37-f726-4a4c-804b-5fda5b81eeag](https://github.com/apache/flink/assets/23545296/cb28bd6f-b229-4d3a-8546-f30777ac341e)
   
![img_v3_025t_e72a53d1-2411-4d79-b9a9-cc9474c3efdg](https://github.com/apache/flink/assets/23545296/cab23e80-2bc0-4306-b06b-9a8cbd0e3ec7)
   
![img_v3_025t_a2b1cfb5-6cf2-45db-bc83-dc475667989g](https://github.com/apache/flink/assets/23545296/28d8bccf-ae5d-4c59-8fb0-14b7c847fd9c)
   
![img_v3_025t_069dd3a5-3848-43c6-81d0-99e7ebc808cg](https://github.com/apache/flink/assets/23545296/ba03b8a1-a17b-46d7-a539-3dbf0763f9e5)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-10 Thread via GitHub


Zakelly commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1849238533

   > > Hi @xiangforever2014 @masteryhx sorry for jumping in... And I'm 
wondering if we could use the capital '-D' for short option? Or I suggest we do 
not leave any short option for this, only using '-detach'. WDYT?
   > > And I agree we could adjust all these CLI options in 2.0.
   > 
   > Thanks for your comment~ But I wonder that is there any problem in current 
implementation? Or why we need to change it to "-D" or "-detach"? Hope to get 
your suggestion~
   
   Hi @xiangforever2014 . Short commands are designed to be readable or easy 
for users to remember. It can be omitted if there is no suitable abbreviation. 
The '-dcp' is not that easy to remember or understand, user may not know what 
it stands for. WDYT? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-11 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1849650290

   > > > Hi @xiangforever2014 @masteryhx sorry for jumping in... And I'm 
wondering if we could use the capital '-D' for short option? Or I suggest we do 
not leave any short option for this, only using '-detach'. WDYT?
   > > > And I agree we could adjust all these CLI options in 2.0.
   > > 
   > > 
   > > Thanks for your comment~ But I wonder that is there any problem in 
current implementation? Or why we need to change it to "-D" or "-detach"? Hope 
to get your suggestion~
   > 
   > Hi @xiangforever2014 . Short commands are designed to be readable or easy 
for users to remember. It can be omitted if there is no suitable abbreviation. 
The '-dcp' is not that easy to remember or understand, user may not know what 
it stands for. WDYT?
   
   Thanks for your suggestion, and I think that's indeed more friendly to 
users, I have followed your suggestion and use '-detach' for the detached mode~ 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-12 Thread via GitHub


xiangforever2014 commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1851604601

   @masteryhx @Zakelly Following your constructive comments, I have rearranged 
the commits and rebased to the master branch, please have a look, many thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-12 Thread via GitHub


Zakelly commented on PR #23253:
URL: https://github.com/apache/flink/pull/23253#issuecomment-1851829103

   > > > > Hi @xiangforever2014 @masteryhx sorry for jumping in... And I'm 
wondering if we could use the capital '-D' for short option? Or I suggest we do 
not leave any short option for this, only using '-detach'. WDYT?
   > > > > And I agree we could adjust all these CLI options in 2.0.
   > > > 
   > > > 
   > > > Thanks for your comment~ But I wonder that is there any problem in 
current implementation? Or why we need to change it to "-D" or "-detach"? Hope 
to get your suggestion~
   > > 
   > > 
   > > Hi @xiangforever2014 . Short commands are designed to be readable or 
easy for users to remember. It can be omitted if there is no suitable 
abbreviation. The '-dcp' is not that easy to remember or understand, user may 
not know what it stands for. WDYT?
   > 
   > Thanks for your suggestion, and I think that's indeed more friendly to 
users, I have followed your suggestion and use '-detach' for the detached mode~
   
   That's cool. +1 from my side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org