(flink) branch master updated: [FLINK-34119][doc] Improve description about changelog in document

hangxiang Wed, 17 Jan 2024 18:23:07 -0800

This is an automated email from the ASF dual-hosted git repository.

hangxiang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git



The following commit(s) were added to refs/heads/master by this push:
     new 2ec8f8157f9 [FLINK-34119][doc] Improve description about changelog in document
2ec8f8157f9 is described below

commit 2ec8f8157f95a79ee94d609657f9b08f8f0b6a26
Author: Hangxiang Yu <master...@gmail.com>
AuthorDate: Sat Jan 13 14:50:36 2024 +0800

    [FLINK-34119][doc] Improve description about changelog in document
---
 docs/content.zh/docs/deployment/config.md        |  3 +--
 docs/content.zh/docs/ops/state/state_backends.md | 13 +++++++------
 docs/content/docs/deployment/config.md           |  3 +--
 docs/content/docs/ops/state/state_backends.md    | 15 ++++++++-------
 4 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/docs/content.zh/docs/deployment/config.md b/docs/content.zh/docs/deployment/config.md
index 34d04f733e5..cf0740bf8de 100644
--- a/docs/content.zh/docs/deployment/config.md
+++ b/docs/content.zh/docs/deployment/config.md
@@ -370,8 +370,7 @@ Advanced options to tune RocksDB and RocksDB checkpoints.
 ### State Changelog Options
 
 Please refer to [State Backends]({{< ref "docs/ops/state/state_backends#enabling-changelog" >}}) for information on
-using State Changelog. {{< hint warning >}} The feature is in experimental status. {{< /hint >}} {{<
-generated/state_backend_changelog_section >}}
+using State Changelog.
 
 #### FileSystem-based Changelog options
 
diff --git a/docs/content.zh/docs/ops/state/state_backends.md b/docs/content.zh/docs/ops/state/state_backends.md
index eda37dada7e..5d7d4f92b1c 100644
--- a/docs/content.zh/docs/ops/state/state_backends.md
+++ b/docs/content.zh/docs/ops/state/state_backends.md
@@ -349,10 +349,6 @@ This feature is not yet supported in the Python API.
 
 ## Enabling Changelog
 
-{{< hint warning >}} This feature is in experimental status. {{< /hint >}}
-
-{{< hint warning >}} Enabling Changelog may have a negative performance impact on your application (see below). {{< /hint >}}
-
 <a name="introduction"></a>
 
 ### Introduction
@@ -372,16 +368,21 @@ Changelog is a feature that aims to reduce checkpointing time and can therefore also
 
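 With Changelog enabled, Flink continuously uploads state changes, forming a changelog. When a checkpoint is created, only the relevant part of this changelog needs to be uploaded. The configured state backend is snapshotted in the background periodically; once a snapshot has been uploaded successfully, the corresponding part of the changelog is truncated.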
 
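-As a result, the duration of the asynchronous phase is reduced (as is that of the synchronous phase, because no data needs to be flushed to disk). In particular, long-tail latency is improved.
+As a result, the duration of the asynchronous phase is reduced (as is that of the synchronous phase, because no data needs to be flushed to disk). In particular, long-tail latency is improved. Changelog also brings some other benefits:
+1. More stable and lower end-to-end latency.
+2. Less data replay after failover.
+3. More stable resource utilization.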
 
 However, resource usage is higher:
 
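 - More files are created on DFS
-- More files may be left behind on DFS (this will be addressed in a later release, after FLINK-25511 and FLINK-25512)
 - More IO bandwidth is used to upload state changes
 - More CPU is used to serialize state changes
 - Task Managers use more memory to buffer state changes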
 
+It is worth noting that although Changelog slightly increases routine CPU and network bandwidth usage,
+it reduces peak CPU and network bandwidth usage.
+
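 Recovery time is another thing to consider. Depending on the `state.backend.changelog.periodic-materialize.interval` setting, the changelog can become lengthy, so replaying it takes more time. Even so, recovery time plus checkpoint duration may still be lower than without changelog, providing lower end-to-end latency even in the case of failover. Of course, depending on the actual ratio of these times, the effective recovery time may also increase.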
 
 For more details, see [FLIP-158](https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints).
diff --git a/docs/content/docs/deployment/config.md b/docs/content/docs/deployment/config.md
index cbdc4f25a77..c4e70ba7235 100644
--- a/docs/content/docs/deployment/config.md
+++ b/docs/content/docs/deployment/config.md
@@ -372,8 +372,7 @@ Advanced options to tune RocksDB and RocksDB checkpoints.
 ### State Changelog Options
 
 Please refer to [State Backends]({{< ref "docs/ops/state/state_backends#enabling-changelog" >}}) for information on
-using State Changelog. {{< hint warning >}} The feature is in experimental status. {{< /hint >}} {{<
-generated/state_backend_changelog_section >}}
+using State Changelog.
 
 #### FileSystem-based Changelog options
 
diff --git a/docs/content/docs/ops/state/state_backends.md b/docs/content/docs/ops/state/state_backends.md
index b645eefcd8b..bd04491977f 100644
--- a/docs/content/docs/ops/state/state_backends.md
+++ b/docs/content/docs/ops/state/state_backends.md
@@ -346,10 +346,6 @@ Still not supported in Python API.
 
 ## Enabling Changelog
 
-{{< hint warning >}} This feature is in experimental status. {{< /hint >}}
-
-{{< hint warning >}} Enabling Changelog may have a negative performance impact on your application (see below). {{< /hint >}}
-
 ### Introduction
 
 Changelog is a feature that aims to decrease checkpointing time and, therefore, end-to-end latency in exactly-once mode.
@@ -361,7 +357,7 @@ Most commonly, checkpoint duration is affected by:
    and [Buffer debloating]({{< ref "docs/ops/state/checkpointing_under_backpressure#buffer-debloating" >}})
 2. Snapshot creation time (so-called synchronous phase), addressed by asynchronous snapshots (mentioned [above]({{<
    ref "#the-embeddedrocksdbstatebackend">}}))
-4. Snapshot upload time (asynchronous phase)
+3. Snapshot upload time (asynchronous phase)
 
 Upload time can be decreased by [incremental checkpoints]({{< ref "#incremental-checkpoints" >}}).
 However, most incremental state backends perform some form of compaction periodically, which results in re-uploading the
@@ -373,16 +369,21 @@ part of this changelog needs to be uploaded. The configured state backend is sna
 background periodically. Upon successful upload, the changelog is truncated.
 
 As a result, asynchronous phase duration is reduced, as well as synchronous phase - because no data needs to be flushed
-to disk. In particular, long-tail latency is improved.
+to disk. In particular, long-tail latency is improved. At the same time, some other benefits could be got:
+1. More Stable and Lower End-to-end Latency.
+2. Less Data Replay after Failover.
+3. More Stable Utilization of Resources.
 
 However, resource usage is higher:
 
 - more files are created on DFS
-- more files can be left undeleted DFS (this will be addressed in the future versions in FLINK-25511 and FLINK-25512)
 - more IO bandwidth is used to upload state changes
 - more CPU used to serialize state changes
 - more memory used by Task Managers to buffer state changes
 
+It is worth noting that changelog adds a small amount of daily CPU and network bandwidth resources,
+but reduces peak CPU and network bandwidth usage.
+
 Recovery time is another thing to consider. Depending on the `state.backend.changelog.periodic-materialize.interval`
 setting, the changelog can become lengthy and replaying it may take more time. However, recovery time combined with
 checkpoint duration will likely still be lower than in non-changelog setups, providing lower end-to-end latency even in
