Renxiang Zhou created FLINK-32881:
-------------------------------------
Summary: Client supports making savepoints in detach mode
Key: FLINK-32881
URL: https://issues.apache.org/jira/browse/FLINK-32881
Project: Flink
Issue Type: Improvement
Components: API / State Processor, Client / Job Submission
Affects Versions: 1.19.0
Reporter: Renxiang Zhou
Fix For: 1.19.0
Attachments: image-2023-08-16-17-14-34-740.png,
image-2023-08-16-17-14-44-212.png
When triggering a savepoint using the command-line tool, the client needs to
wait for the job to finish creating the savepoint before it can exit. For jobs
with large state, the savepoint creation process can be time-consuming, leading
to the following problems:
# Platform users may need to manage thousands of Flink tasks on a single
client machine. With the current savepoint triggering mode, all savepoint
creation threads on that machine have to wait for the job to finish the
snapshot, resulting in significant resource waste;
# If creating the savepoint takes longer than the client's timeout, the
client throws a timeout exception and reports that triggering the savepoint
failed. Since savepoint durations vary from job to job, it is
difficult to choose a suitable value for the client's timeout parameter.
Therefore, we propose adding a detach mode for triggering savepoints on the
client side, similar to the detach mode used when submitting jobs. Here are
some specific details:
# The savepoint UUID will be generated on the client side. After successfully
triggering the savepoint, the client immediately returns the UUID information.
# Add a "dump-pending-savepoints" API that allows the client to
check whether the triggered savepoint has been successfully created.
With these changes, the client can detach from the savepoint creation
process, which reduces resource waste, while still having a way to check the
status of savepoint creation.
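The two steps proposed above could be sketched roughly as follows. This is a minimal illustration only, not Flink code: FakeJobManager, trigger_savepoint_detached and savepoint_status are hypothetical names, and the cluster is replaced by an in-memory stand-in. The sketch assumes the status check is keyed by the job ID plus the client-generated UUID.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class SavepointStatus(Enum):
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


@dataclass
class FakeJobManager:
    """Hypothetical in-memory stand-in for the cluster side (not a Flink API)."""

    pending: dict = field(default_factory=dict)

    def trigger_savepoint(self, job_id, savepoint_id):
        # Record the request and return at once; the snapshot itself
        # is assumed to run asynchronously on the cluster.
        self.pending[(job_id, savepoint_id)] = SavepointStatus.IN_PROGRESS

    def finish_savepoint(self, job_id, savepoint_id, success=True):
        # Simulates the job completing (or failing) the snapshot later.
        self.pending[(job_id, savepoint_id)] = (
            SavepointStatus.COMPLETED if success else SavepointStatus.FAILED
        )

    def savepoint_status(self, job_id, savepoint_id):
        # Plays the role of the proposed status-check call.
        return self.pending[(job_id, savepoint_id)]


def trigger_savepoint_detached(cluster, job_id):
    """Generate the savepoint UUID on the client, trigger, and return immediately."""
    savepoint_id = str(uuid.uuid4())
    cluster.trigger_savepoint(job_id, savepoint_id)
    return savepoint_id
```

In this sketch the client thread is released as soon as trigger_savepoint_detached returns the UUID. A later, separate call to savepoint_status with the same UUID reports whether the savepoint is still in progress, completed, or failed, so no client-side timeout has to cover the full snapshot duration.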
!image-2023-08-16-17-14-34-740.png!!image-2023-08-16-17-14-44-212.png!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)