[GitHub] [flink] klion26 commented on a change in pull request #12139: [FLINK-16076] Translate "Queryable State" page into Chinese

GitBox Mon, 18 May 2020 19:57:18 -0700


klion26 commented on a change in pull request #12139:
URL: https://github.com/apache/flink/pull/12139#discussion_r426997228




##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -27,75 +27,54 @@ under the License.
 {:toc}
 
 <div class="alert alert-warning">
-  <strong>Note:</strong> The client APIs for queryable state are currently in an evolving state and
-  there are <strong>no guarantees</strong> made about stability of the provided interfaces. It is
-  likely that there will be breaking API changes on the client side in the upcoming Flink versions.
+  <strong>注意:</strong> 目前 querable state 的客户端 API 还在不断演进，<strong>不保证</strong>现有接口的稳定性。在后续的 Flink 版本中有可能发生 API 变化。
 </div>
 
-In a nutshell, this feature exposes Flink's managed keyed (partitioned) state
-(see [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) to the outside world and
-allows the user to query a job's state from outside Flink. For some scenarios, queryable state
-eliminates the need for distributed operations/transactions with external systems such as key-value
-stores which are often the bottleneck in practice. In addition, this feature may be particularly
-useful for debugging purposes.
+简而言之, 这个特性将 Flink 的 managed keyed (partitioned) state
+(参考 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) 暴露给外部，从而用户可以在 Flink 外部查询作业 state。
+在某些场景中，Queryable State 消除了对外部系统的分布式操作以及事务的需求，比如 KV 存储系统，而这些外部系统往往会成为瓶颈。除此之外，这个特性对于调试作业非常有用。
 
 <div class="alert alert-warning">
-  <strong>Attention:</strong> When querying a state object, that object is accessed from a concurrent
-  thread without any synchronization or copying. This is a design choice, as any of the above would lead
-  to increased job latency, which we wanted to avoid. Since any state backend using Java heap space,
-  <i>e.g.</i> <code>MemoryStateBackend</code> or <code>FsStateBackend</code>, does not work
-  with copies when retrieving values but instead directly references the stored values, read-modify-write
-  patterns are unsafe and may cause the queryable state server to fail due to concurrent modifications.
-  The <code>RocksDBStateBackend</code> is safe from these issues.
+  <strong>注意:</strong> 进行查询时，state 会在并发线程中被访问，但 state 不会进行同步和拷贝。这种设计是为了避免同步和拷贝带来的作业延时。对于使用 Java 堆内存的 state backend，
+  <i>比如</i> <code>MemoryStateBackend</code> 或者 <code>FsStateBackend</code>，它们获取状态时不会进行拷贝，而是直接引用状态对象，所以对状态的 read-modify-write 是不安全的，并且
+ 可能会因为并发修改导致查询失败。但 <code>RocksDBStateBackend</code> 是安全的，不会遇到上述问题。
 </div>
 
-## Architecture
+## 架构
 
-Before showing how to use the Queryable State, it is useful to briefly describe the entities that compose it.
-The Queryable State feature consists of three main entities:
+在展示如何使用 Queryable State 之前，先简单描述一下该特性的组成部分，主要包括以下三部分:
 
- 1. the `QueryableStateClient`, which (potentially) runs outside the Flink cluster and submits the user queries,
- 2. the `QueryableStateClientProxy`, which runs on each `TaskManager` (*i.e.* inside the Flink cluster) and is responsible
- for receiving the client's queries, fetching the requested state from the responsible Task Manager on his behalf, and
- returning it to the client, and
- 3. the `QueryableStateServer` which runs on each `TaskManager` and is responsible for serving the locally stored state.
+ 1. `QueryableStateClient`，默认运行在 Flink 集群外部，负责提交用户的查询请求；
+ 2. `QueryableStateClientProxy`，运行在每个 `TaskManager` 上(*即* Flink 集群内部)，负责接收客户端的查询请求，从所负责的 Task Manager 获取请求的 state，并返回给客户端；
+ 3. `QueryableStateServer`, 运行在 `TaskManager` 上，负责服务本地存储的 state。
 
-The client connects to one of the proxies and sends a request for the state associated with a specific
-key, `k`. As stated in [Working with State]({{ site.baseurl }}/dev/stream/state/state.html), keyed state is organized in
-*Key Groups*, and each `TaskManager` is assigned a number of these key groups. To discover which `TaskManager` is
-responsible for the key group holding `k`, the proxy will ask the `JobManager`. Based on the answer, the proxy will
-then query the `QueryableStateServer` running on that `TaskManager` for the state associated with `k`, and forward the
-response back to the client.
+客户端连接到一个代理，并发送请求获取特定 `k` 对应的 state。 如 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)所述，keyed state 按照
+*Key Groups* 进行划分，每个 `TaskManager` 会分配其中的一些 key groups。代理会询问 `JobManager` 以找到 `k` 所属 key group 的 TaskManager。根据返回的结果, 代理
+将会向运行在 `TaskManager` 上的 `QueryableStateServer` 查询 `k` 对应的 state， 并将结果返回给客户端。
 
-## Activating Queryable State
+## 激活 Queryable State
 
-To enable queryable state on your Flink cluster, you need to do the following:
+为了在 Flink 集群上使用 queryable state，需要进行以下操作：
 
- 1. copy the `flink-queryable-state-runtime{{ site.scala_version_suffix }}-{{site.version }}.jar`
-from the `opt/` folder of your [Flink distribution](https://flink.apache.org/downloads.html "Apache Flink: Downloads"),
-to the `lib/` folder.
- 2. set the property `queryable-state.enable` to `true`. See the [Configuration]({{ site.baseurl }}/ops/config.html#queryable-state) documentation for details and additional parameters.
+ 1. 将 `flink-queryable-state-runtime{{ site.scala_version_suffix }}-{{site.version }}.jar`
+从 [Flink distribution](https://flink.apache.org/downloads.html "Apache Flink: Downloads") 的 `opt/` 目录拷贝到 `lib/` 目录；
+ 2. 将参数 `queryable-state.enable` 设置为 `true`。详细信息以及其它配置可参考文档 [Configuration]({{ site.baseurl }}/ops/config.html#queryable-state)。
 
-To verify that your cluster is running with queryable state enabled, check the logs of any
-task manager for the line: `"Started the Queryable State Proxy Server @ ..."`.
+为了验证集群的 queryable stat 已经被激活，可以检查任意 task manager 的日志中是否包含 "Started the Queryable State Proxy Server @ ..."。

Review comment:
       ```suggestion
   为了验证集群的 queryable state 已经被激活，可以检查任意 task manager 的日志中是否包含 "Started the Queryable State Proxy Server @ ..."。
   ```

##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -180,18 +154,16 @@ jar which must be explicitly included as a dependency in the `pom.xml` of your p
 {% endhighlight %}
 </div>
 
-For more on this, you can check how to [set up a Flink program]({{ site.baseurl }}/dev/projectsetup/dependencies.html).
+关于依赖的更多信息, 可以参考如何[配置Flink项目]({{ site.baseurl }}/zh/dev/projectsetup/dependencies.html).

Review comment:
       ```suggestion
   关于依赖的更多信息, 可以参考如何[配置 Flink 项目]({{ site.baseurl }}/zh/dev/projectsetup/dependencies.html).
   ```

##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -27,75 +27,54 @@ under the License.
 {:toc}
 
 <div class="alert alert-warning">
-  <strong>Note:</strong> The client APIs for queryable state are currently in an evolving state and
-  there are <strong>no guarantees</strong> made about stability of the provided interfaces. It is
-  likely that there will be breaking API changes on the client side in the upcoming Flink versions.
+  <strong>注意:</strong> 目前 querable state 的客户端 API 还在不断演进，<strong>不保证</strong>现有接口的稳定性。在后续的 Flink 版本中有可能发生 API 变化。
 </div>
 
-In a nutshell, this feature exposes Flink's managed keyed (partitioned) state
-(see [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) to the outside world and
-allows the user to query a job's state from outside Flink. For some scenarios, queryable state
-eliminates the need for distributed operations/transactions with external systems such as key-value
-stores which are often the bottleneck in practice. In addition, this feature may be particularly
-useful for debugging purposes.
+简而言之, 这个特性将 Flink 的 managed keyed (partitioned) state
+(参考 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) 暴露给外部，从而用户可以在 Flink 外部查询作业 state。
+在某些场景中，Queryable State 消除了对外部系统的分布式操作以及事务的需求，比如 KV 存储系统，而这些外部系统往往会成为瓶颈。除此之外，这个特性对于调试作业非常有用。
 
 <div class="alert alert-warning">
-  <strong>Attention:</strong> When querying a state object, that object is accessed from a concurrent
-  thread without any synchronization or copying. This is a design choice, as any of the above would lead
-  to increased job latency, which we wanted to avoid. Since any state backend using Java heap space,
-  <i>e.g.</i> <code>MemoryStateBackend</code> or <code>FsStateBackend</code>, does not work
-  with copies when retrieving values but instead directly references the stored values, read-modify-write
-  patterns are unsafe and may cause the queryable state server to fail due to concurrent modifications.
-  The <code>RocksDBStateBackend</code> is safe from these issues.
+  <strong>注意:</strong> 进行查询时，state 会在并发线程中被访问，但 state 不会进行同步和拷贝。这种设计是为了避免同步和拷贝带来的作业延时。对于使用 Java 堆内存的 state backend，
+  <i>比如</i> <code>MemoryStateBackend</code> 或者 <code>FsStateBackend</code>，它们获取状态时不会进行拷贝，而是直接引用状态对象，所以对状态的 read-modify-write 是不安全的，并且
+ 可能会因为并发修改导致查询失败。但 <code>RocksDBStateBackend</code> 是安全的，不会遇到上述问题。

Review comment:
       No blank line is needed here; a blank line adds an extra space.

##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -27,75 +27,54 @@ under the License.
 {:toc}
 
 <div class="alert alert-warning">
-  <strong>Note:</strong> The client APIs for queryable state are currently in an evolving state and
-  there are <strong>no guarantees</strong> made about stability of the provided interfaces. It is
-  likely that there will be breaking API changes on the client side in the upcoming Flink versions.
+  <strong>注意:</strong> 目前 querable state 的客户端 API 还在不断演进，<strong>不保证</strong>现有接口的稳定性。在后续的 Flink 版本中有可能发生 API 变化。
 </div>
 
-In a nutshell, this feature exposes Flink's managed keyed (partitioned) state
-(see [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) to the outside world and
-allows the user to query a job's state from outside Flink. For some scenarios, queryable state
-eliminates the need for distributed operations/transactions with external systems such as key-value
-stores which are often the bottleneck in practice. In addition, this feature may be particularly
-useful for debugging purposes.
+简而言之, 这个特性将 Flink 的 managed keyed (partitioned) state
+(参考 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) 暴露给外部，从而用户可以在 Flink 外部查询作业 state。
+在某些场景中，Queryable State 消除了对外部系统的分布式操作以及事务的需求，比如 KV 存储系统，而这些外部系统往往会成为瓶颈。除此之外，这个特性对于调试作业非常有用。
 
 <div class="alert alert-warning">
-  <strong>Attention:</strong> When querying a state object, that object is accessed from a concurrent
-  thread without any synchronization or copying. This is a design choice, as any of the above would lead
-  to increased job latency, which we wanted to avoid. Since any state backend using Java heap space,
-  <i>e.g.</i> <code>MemoryStateBackend</code> or <code>FsStateBackend</code>, does not work
-  with copies when retrieving values but instead directly references the stored values, read-modify-write
-  patterns are unsafe and may cause the queryable state server to fail due to concurrent modifications.
-  The <code>RocksDBStateBackend</code> is safe from these issues.
+  <strong>注意:</strong> 进行查询时，state 会在并发线程中被访问，但 state 不会进行同步和拷贝。这种设计是为了避免同步和拷贝带来的作业延时。对于使用 Java 堆内存的 state backend，
+  <i>比如</i> <code>MemoryStateBackend</code> 或者 <code>FsStateBackend</code>，它们获取状态时不会进行拷贝，而是直接引用状态对象，所以对状态的 read-modify-write 是不安全的，并且
+ 可能会因为并发修改导致查询失败。但 <code>RocksDBStateBackend</code> 是安全的，不会遇到上述问题。
 </div>
 
-## Architecture
+## 架构
 
-Before showing how to use the Queryable State, it is useful to briefly describe the entities that compose it.
-The Queryable State feature consists of three main entities:
+在展示如何使用 Queryable State 之前，先简单描述一下该特性的组成部分，主要包括以下三部分:
 
- 1. the `QueryableStateClient`, which (potentially) runs outside the Flink cluster and submits the user queries,
- 2. the `QueryableStateClientProxy`, which runs on each `TaskManager` (*i.e.* inside the Flink cluster) and is responsible
- for receiving the client's queries, fetching the requested state from the responsible Task Manager on his behalf, and
- returning it to the client, and
- 3. the `QueryableStateServer` which runs on each `TaskManager` and is responsible for serving the locally stored state.
+ 1. `QueryableStateClient`，默认运行在 Flink 集群外部，负责提交用户的查询请求；
+ 2. `QueryableStateClientProxy`，运行在每个 `TaskManager` 上(*即* Flink 集群内部)，负责接收客户端的查询请求，从所负责的 Task Manager 获取请求的 state，并返回给客户端；
+ 3. `QueryableStateServer`, 运行在 `TaskManager` 上，负责服务本地存储的 state。
 
-The client connects to one of the proxies and sends a request for the state associated with a specific
-key, `k`. As stated in [Working with State]({{ site.baseurl }}/dev/stream/state/state.html), keyed state is organized in
-*Key Groups*, and each `TaskManager` is assigned a number of these key groups. To discover which `TaskManager` is
-responsible for the key group holding `k`, the proxy will ask the `JobManager`. Based on the answer, the proxy will
-then query the `QueryableStateServer` running on that `TaskManager` for the state associated with `k`, and forward the
-response back to the client.
+客户端连接到一个代理，并发送请求获取特定 `k` 对应的 state。 如 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)所述，keyed state 按照

Review comment:
       ```suggestion
   客户端连接到一个代理，并发送请求获取特定 `k` 对应的 state。 如 [Working with State]({{ site.baseurl }}/zh/dev/stream/state/state.html) 所述，keyed state 按照
   ```

##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -119,28 +98,25 @@ QueryableStateStream asQueryableState(
 
 
 <div class="alert alert-info">
-  <strong>Note:</strong> There is no queryable <code>ListState</code> sink as it would result in an ever-growing
-  list which may not be cleaned up and thus will eventually consume too much memory.
+  <strong>注意:</strong> 没有可查询的 <code>ListState</code> sink，因为这种情况下 list 会不断增长，并且可能不会被清理，最终会消耗大量的内存。
 </div>
 
-The returned `QueryableStateStream` can be seen as a sink and **cannot** be further transformed. Internally, a
-`QueryableStateStream` gets translated to an operator which uses all incoming records to update the queryable state
-instance. The updating logic is implied by the type of the `StateDescriptor` provided in the `asQueryableState` call.
-In a program like the following, all records of the keyed stream will be used to update the state instance via the
-`ValueState.update(value)`:
+返回的 `QueryableStateStream` 可以被视作一个sink，而且**不能再**被进一步转换。在内部实现上，一个 `QueryableStateStream` 被转换成一个 operator，
+使用输入的数据来更新 queryable state。state 如何更新是由 `asQueryableState` 提供的 `StateDescriptor` 来决定的。在下面的代码中, keyed stream 的所有数据
+将会通过 `ValueState.update(value)` 来更新状态：

Review comment:
       No line break is needed here.

##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -150,20 +126,18 @@ descriptor.setQueryable("query-name"); // queryable state name
 {% endhighlight %}
 
 <div class="alert alert-info">
-  <strong>Note:</strong> The <code>queryableStateName</code> parameter may be chosen arbitrarily and is only
-  used for queries. It does not have to be identical to the state's own name.
+  <strong>注意:</strong> 参数 <code>queryableStateName</code> 可以任意选取，并且只被用来进行查询，它可以和 state 的名称不同。
 </div>
 
-This variant has no limitations as to which type of state can be made queryable. This means that this can be used for
-any `ValueState`, `ReduceState`, `ListState`, `MapState`, `AggregatingState`, and the currently deprecated `FoldingState`.
+这种方式不会限制 state 类型，即任意的 `ValueState`、`ReduceState`、`ListState`、`MapState`、`AggregatingState` 以及已弃用的 `FoldingState`
+均可作为 queryable state。
 
-## Querying State
+## 查询 state
 
-So far, you have set up your cluster to run with queryable state and you have declared (some of) your state as
-queryable. Now it is time to see how to query this state. 
+目前为止，你已经激活了集群的 queryable state 功能，并且将一些 state 设置成了可查询的，接下来将会展示如何进行查询。
 
-For this you can use the `QueryableStateClient` helper class. This is available in the `flink-queryable-state-client`
-jar which must be explicitly included as a dependency in the `pom.xml` of your project along with `flink-core`, as shown below:
+为了进行查询，可以使用辅助类 `QueryableStateClient`，这个类位于 `flink-queryable-state-client` 的jar中，在项目的 `pom.xml` 需要显示添加

Review comment:
       ```suggestion
   为了进行查询，可以使用辅助类 `QueryableStateClient`，这个类位于 `flink-queryable-state-client` 的 jar 中，在项目的 `pom.xml` 需要显示添加
   ```

##########
File path: docs/dev/stream/state/queryable_state.zh.md
##########
@@ -27,75 +27,54 @@ under the License.
 {:toc}
 
 <div class="alert alert-warning">
-  <strong>Note:</strong> The client APIs for queryable state are currently in an evolving state and
-  there are <strong>no guarantees</strong> made about stability of the provided interfaces. It is
-  likely that there will be breaking API changes on the client side in the upcoming Flink versions.
+  <strong>注意:</strong> 目前 querable state 的客户端 API 还在不断演进，<strong>不保证</strong>现有接口的稳定性。在后续的 Flink 版本中有可能发生 API 变化。
 </div>
 
-In a nutshell, this feature exposes Flink's managed keyed (partitioned) state
-(see [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) to the outside world and
-allows the user to query a job's state from outside Flink. For some scenarios, queryable state
-eliminates the need for distributed operations/transactions with external systems such as key-value
-stores which are often the bottleneck in practice. In addition, this feature may be particularly
-useful for debugging purposes.
+简而言之, 这个特性将 Flink 的 managed keyed (partitioned) state
+(参考 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)) 暴露给外部，从而用户可以在 Flink 外部查询作业 state。
+在某些场景中，Queryable State 消除了对外部系统的分布式操作以及事务的需求，比如 KV 存储系统，而这些外部系统往往会成为瓶颈。除此之外，这个特性对于调试作业非常有用。
 
 <div class="alert alert-warning">
-  <strong>Attention:</strong> When querying a state object, that object is accessed from a concurrent
-  thread without any synchronization or copying. This is a design choice, as any of the above would lead
-  to increased job latency, which we wanted to avoid. Since any state backend using Java heap space,
-  <i>e.g.</i> <code>MemoryStateBackend</code> or <code>FsStateBackend</code>, does not work
-  with copies when retrieving values but instead directly references the stored values, read-modify-write
-  patterns are unsafe and may cause the queryable state server to fail due to concurrent modifications.
-  The <code>RocksDBStateBackend</code> is safe from these issues.
+  <strong>注意:</strong> 进行查询时，state 会在并发线程中被访问，但 state 不会进行同步和拷贝。这种设计是为了避免同步和拷贝带来的作业延时。对于使用 Java 堆内存的 state backend，
+  <i>比如</i> <code>MemoryStateBackend</code> 或者 <code>FsStateBackend</code>，它们获取状态时不会进行拷贝，而是直接引用状态对象，所以对状态的 read-modify-write 是不安全的，并且
+ 可能会因为并发修改导致查询失败。但 <code>RocksDBStateBackend</code> 是安全的，不会遇到上述问题。
 </div>
 
-## Architecture
+## 架构
 
-Before showing how to use the Queryable State, it is useful to briefly describe the entities that compose it.
-The Queryable State feature consists of three main entities:
+在展示如何使用 Queryable State 之前，先简单描述一下该特性的组成部分，主要包括以下三部分:
 
- 1. the `QueryableStateClient`, which (potentially) runs outside the Flink cluster and submits the user queries,
- 2. the `QueryableStateClientProxy`, which runs on each `TaskManager` (*i.e.* inside the Flink cluster) and is responsible
- for receiving the client's queries, fetching the requested state from the responsible Task Manager on his behalf, and
- returning it to the client, and
- 3. the `QueryableStateServer` which runs on each `TaskManager` and is responsible for serving the locally stored state.
+ 1. `QueryableStateClient`，默认运行在 Flink 集群外部，负责提交用户的查询请求；
+ 2. `QueryableStateClientProxy`，运行在每个 `TaskManager` 上(*即* Flink 集群内部)，负责接收客户端的查询请求，从所负责的 Task Manager 获取请求的 state，并返回给客户端；
+ 3. `QueryableStateServer`, 运行在 `TaskManager` 上，负责服务本地存储的 state。
 
-The client connects to one of the proxies and sends a request for the state associated with a specific
-key, `k`. As stated in [Working with State]({{ site.baseurl }}/dev/stream/state/state.html), keyed state is organized in
-*Key Groups*, and each `TaskManager` is assigned a number of these key groups. To discover which `TaskManager` is
-responsible for the key group holding `k`, the proxy will ask the `JobManager`. Based on the answer, the proxy will
-then query the `QueryableStateServer` running on that `TaskManager` for the state associated with `k`, and forward the
-response back to the client.
+客户端连接到一个代理，并发送请求获取特定 `k` 对应的 state。 如 [Working with State]({{ site.baseurl }}/dev/stream/state/state.html)所述，keyed state 按照
+*Key Groups* 进行划分，每个 `TaskManager` 会分配其中的一些 key groups。代理会询问 `JobManager` 以找到 `k` 所属 key group 的 TaskManager。根据返回的结果, 代理
+将会向运行在 `TaskManager` 上的 `QueryableStateServer` 查询 `k` 对应的 state， 并将结果返回给客户端。

Review comment:
       No line break is needed here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
