smattheis commented on a change in pull request #18765:
URL: https://github.com/apache/flink/pull/18765#discussion_r816799727



##########
File path: docs/content/docs/ops/state/checkpoint_vs_savepoint.md
##########
@@ -0,0 +1,86 @@
+---
+title: "Checkpoint VS Savepoint"
+weight: 10
+type: docs
+aliases:
+  - /ops/state/checkpoint_vs_savepoint.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Checkpoint VS Savepoint
+
+## Overview
+
+Conceptually, Flink's [Savepoints]({{< ref "docs/ops/state/savepoints" >}}) are different from [Checkpoints]({{< ref "docs/ops/state/checkpoints" >}})
+in a similar way that backups are different from recovery logs in traditional database systems.
+The primary purpose of Checkpoints is to provide a recovery mechanism in case of unexpected job failures.
+A [Checkpoint's lifecycle]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}}) is managed by Flink,
+i.e. a Checkpoint is created, owned, and released by Flink - without user interaction.
+As a method of recovery and being periodically triggered, two main design goals for the Checkpoint implementation are
+i) being as lightweight to create and ii) being as fast to restore from as possible.
+Optimizations towards those goals can exploit certain properties, e.g. that the job code doesn't change between the execution attempts.
+Checkpoints are usually dropped after the job was terminated by the user (except if explicitly configured as retained Checkpoints).
+
+In contrast to all this, Savepoints are created, owned, and deleted by the user.
+Their use-case is for planned, manual backup and resume. For example, this could be an update of your Flink version, changing your job graph,
+changing parallelism, forking a second job like for a red/blue deployment, and so on.
+Of course, Savepoints must survive job termination. Conceptually, Savepoints can be a bit more expensive
+to produce and restore and focus more on portability and support for the previously mentioned changes to the job.
+
+### The main checkpoint differences
+
+[Checkpoints]({{< ref "docs/ops/state/checkpoints" >}})  have a few differences from [savepoints]({{< ref "docs/ops/state/savepoints" >}}). They
+- use a state backend specific (low-level) data format, may be incremental. (starting from Flink 1.15 savepoints can also use the backend [native]({{< ref "docs/ops/state/savepoints" >}}#savepoint-format) format.)
+- do not support Flink specific features like rescaling.

Review comment:
       Yes, that definitely contradicts the information of the table. Could you please clarify that?



