This is an automated email from the ASF dual-hosted git repository.

jark pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/master by this push:
     new d9f9037  [FLINK-11636][docs-zh] Translate "State Schema Evolution" page into Chinese
d9f9037 is described below

commit d9f9037b1508b81396d83aee72a90e89a9b1c6ad
Author: yangfei5 <yangf...@xiaomi.com>
AuthorDate: Tue Apr 30 17:11:03 2019 +0800

    [FLINK-11636][docs-zh] Translate "State Schema Evolution" page into Chinese

    This closes #8319
---
 docs/dev/stream/state/schema_evolution.zh.md | 91 ++++++++++++----------------
 1 file changed, 40 insertions(+), 51 deletions(-)

diff --git a/docs/dev/stream/state/schema_evolution.zh.md b/docs/dev/stream/state/schema_evolution.zh.md
index 69d1b5c..d02cc7a 100644
--- a/docs/dev/stream/state/schema_evolution.zh.md
+++ b/docs/dev/stream/state/schema_evolution.zh.md
@@ -1,5 +1,5 @@
 ---
-title: "State Schema Evolution"
+title: "状态数据结构升级"
 nav-parent_id: streaming_state
 nav-pos: 6
 ---
@@ -25,17 +25,16 @@ under the License.
 * ToC
 {:toc}

-Apache Flink streaming applications are typically designed to run indefinitely or for long periods of time.
-As with all long-running services, the applications need to be updated to adapt to changing requirements.
-This goes the same for data schemas that the applications work against; they evolve along with the application.
+Apache Flink 流应用通常被设计为永远或者长时间运行。
+与所有长期运行的服务一样,应用程序需要随着业务的迭代而进行调整。
+应用所处理的数据 schema 也会随着进行变化。

-This page provides an overview of how you can evolve your state type's data schema.
-The current restrictions varies across different types and state structures (`ValueState`, `ListState`, etc.).
+此页面概述了如何升级状态类型的数据 schema 。
+目前对不同类型的状态结构(`ValueState`、`ListState` 等)有不同的限制

-Note that the information on this page is relevant only if you are using state serializers that are
-generated by Flink's own [type serialization framework]({{ site.baseurl }}/dev/types_serialization.html).
-That is, when declaring your state, the provided state descriptor is not configured to use a specific `TypeSerializer`
-or `TypeInformation`, in which case Flink infers information about the state type:
+请注意,此页面的信息只与 Flink 自己生成的状态序列化器相关 [类型序列化框架]({{ site.baseurl }}/zh/dev/types_serialization.html)。
+也就是说,在声明状态时,状态描述符不可以配置为使用特定的 TypeSerializer 或 TypeInformation ,
+在这种情况下,Flink 会推断状态类型的信息:

 <div data-lang="java" markdown="1">
 {% highlight java %}
@@ -48,62 +47,52 @@ checkpointedState = getRuntimeContext().getListState(descriptor);
 {% endhighlight %}
 </div>

-Under the hood, whether or not the schema of state can be evolved depends on the serializer used to read / write
-persisted state bytes. Simply put, a registered state's schema can only be evolved if its serializer properly
-supports it. This is handled transparently by serializers generated by Flink's type serialization framework
-(current scope of support is listed [below]({{ site.baseurl }}/dev/stream/state/schema_evolution.html#supported-data-types-for-schema-evolution)).
+在内部,状态是否可以进行升级取决于用于读写持久化状态字节的序列化器。
+简而言之,状态数据结构只有在其序列化器正确支持时才能升级。
+这一过程是被 Flink 的类型序列化框架生成的序列化器透明处理的([下面]({{ site.baseurl }}/zh/dev/stream/state/schema_evolution.html#数据结构升级支持的数据类型) 列出了当前的支持范围)。

-If you intend to implement a custom `TypeSerializer` for your state type and would like to learn how to implement
-the serializer to support state schema evolution, please refer to
-[Custom State Serialization]({{ site.baseurl }}/dev/stream/state/custom_serialization.html).
-The documentation there also covers necessary internal details about the interplay between state serializers and Flink's
-state backends to support state schema evolution.
+如果你想要为你的状态类型实现自定义的 `TypeSerializer` 并且想要学习如何实现支持状态数据结构升级的序列化器,
+可以参考 [自定义状态序列化器]({{ site.baseurl }}/zh/dev/stream/state/custom_serialization.html)。
+本文档也包含一些用于支持状态数据结构升级的状态序列化器与 Flink 状态后端存储相互作用的必要内部细节。

-## Evolving state schema
+## 升级状态数据结构

-To evolve the schema of a given state type, you would take the following steps:
+为了对给定的状态类型进行升级,你需要采取以下几个步骤:

- 1. Take a savepoint of your Flink streaming job.
- 2. Update state types in your application (e.g., modifying your Avro type schema).
- 3. Restore the job from the savepoint. When accessing state for the first time, Flink will assess whether or not
-    the schema had been changed for the state, and migrate state schema if necessary.
+ 1. 对 Flink 流作业进行 savepoint 操作。
+ 2. 升级程序中的状态类型(例如:修改你的 Avro 结构)。
+ 3. 从 savepoint 恢复作业。当第一次访问状态数据时,Flink 会判断状态数据 schema 是否已经改变,并进行必要的迁移。

-The process of migrating state to adapt to changed schemas happens automatically, and independently for each state.
-This process is performed internally by Flink by first checking if the new serializer for the state has different
-serialization schema than the previous serializer; if so, the previous serializer is used to read the state to objects,
-and written back to bytes again with the new serializer.
+用来适应状态结构的改变而进行的状态迁移过程是自动发生的,并且状态之间是互相独立的。
+Flink 内部是这样来进行处理的,首先会检查新的序列化器相对比之前的序列化器是否有不同的状态结构;如果有,
+那么之前的序列化器用来读取状态数据字节到对象,然后使用新的序列化器将对象回写为字节。

-Further details about the migration process is out of the scope of this documentation; please refer to
-[here]({{ site.baseurl }}/dev/stream/state/custom_serialization.html).
+更多的迁移过程细节不在本文档谈论的范围;可以参考[文档]({{ site.baseurl }}/zh/dev/stream/state/custom_serialization.html)。

-## Supported data types for schema evolution
+## 数据结构升级支持的数据类型

-Currently, schema evolution is supported only for POJO and Avro types. Therefore, if you care about schema evolution for
-state, it is currently recommended to always use either Pojo or Avro for state data types.
+目前,仅支持 POJO 和 Avro 类型的 schema 升级
+因此,如果你比较关注于状态数据结构的升级,那么目前来看强烈推荐使用 Pojo 或者 Avro 状态数据类型。

-There are plans to extend the support for more composite types; for more details,
-please refer to [FLINK-10896](https://issues.apache.org/jira/browse/FLINK-10896).
+我们有计划支持更多的复合类型;更多的细节可以参考 [FLINK-10896](https://issues.apache.org/jira/browse/FLINK-10896)。

-### POJO types
+### POJO 类型

-Flink supports evolving schema of [POJO types]({{ site.baseurl }}/dev/types_serialization.html#rules-for-pojo-types),
-based on the following set of rules:
+Flink 基于下面的规则来支持 [POJO 类型]({{ site.baseurl }}/zh/dev/types_serialization.html#pojo-类型的规则)结构的升级:

- 1. Fields can be removed. Once removed, the previous value for the removed field will be dropped in future checkpoints and savepoints.
- 2. New fields can be added. The new field will be initialized to the default value for its type, as
-    [defined by Java](https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html).
- 3. Declared fields types cannot change.
- 4. Class name of the POJO type cannot change, including the namespace of the class.
+ 1. 可以删除字段。一旦删除,被删除字段的前值将会在将来的 checkpoints 以及 savepoints 中删除。
+ 2. 可以添加字段。新字段会使用类型对应的默认值进行初始化,比如 [Java 类型](https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html)。
+ 3. 不可以修改字段的声明类型。
+ 4. 不可以改变 POJO 类型的类名,包括类的命名空间。

-Note that the schema of POJO type state can only be evolved when restoring from a previous savepoint with Flink versions
-newer than 1.8.0. When restoring with Flink versions older than 1.8.0, the schema cannot be changed.
+需要注意,只有从 1.8.0 及以上版本的 Flink 生产的 savepoint 进行恢复时,POJO 类型的状态才可以进行升级。
+对 1.8.0 版本之前的 Flink 是没有办法进行 POJO 类型升级的。

-### Avro types
+### Avro 类型

-Flink fully supports evolving schema of Avro type state, as long as the schema change is considered compatible by
-[Avro's rules for schema resolution](http://avro.apache.org/docs/current/spec.html#Schema+Resolution).
+Flink 完全支持 Avro 状态类型的升级,只要数据结构的修改是被
+[Avro 的数据结构解析规则](http://avro.apache.org/docs/current/spec.html#Schema+Resolution)认为兼容的即可。

-One limitation is that Avro generated classes used as the state type cannot be relocated or have different
-namespaces when the job is restored.
+一个例外是如果新的 Avro 数据 schema 生成的类无法被重定位或者使用了不同的命名空间,在作业恢复时状态数据会被认为是不兼容的。

 {% top %}
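
The migration step described in the page text of the diff (read persisted bytes with the previous serializer, write them back with the new one) and the POJO rules (removed fields are dropped, added fields start at the Java default) can be sketched in plain Java. This is only an illustration with no Flink dependency: `StateMigrationSketch`, `Counter`, the `debug` and `lastSeen` fields, and the byte layouts are all hypothetical, and Flink's real serializers and snapshot format differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Plain-Java illustration of "read with old layout, write with new layout". Not Flink code. */
public class StateMigrationSketch {

    /** Hypothetical upgraded POJO: the old 'debug' field was removed, 'lastSeen' was added. */
    public static final class Counter {
        public String key;
        public long total;
        public long lastSeen; // added field: keeps the Java default (0L) after migration
    }

    /** Hypothetical old layout: key, total, debug. */
    public static byte[] writeOld(String key, long total, boolean debug) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(key);
            out.writeLong(total);
            out.writeBoolean(debug);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the old layout into the new POJO; the removed field is consumed and discarded. */
    public static Counter readOld(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            Counter c = new Counter();
            c.key = in.readUTF();
            c.total = in.readLong();
            in.readBoolean(); // removed field 'debug': value is dropped
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical new layout: key, total, lastSeen. */
    public static byte[] writeNew(Counter c) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(c.key);
            out.writeLong(c.total);
            out.writeLong(c.lastSeen);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the new layout. */
    public static Counter readNew(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            Counter c = new Counter();
            c.key = in.readUTF();
            c.total = in.readLong();
            c.lastSeen = in.readLong();
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Migration: old bytes -> object (old reader) -> new bytes (new writer). */
    public static byte[] migrate(byte[] oldBytes) {
        return writeNew(readOld(oldBytes));
    }

    public static void main(String[] args) {
        byte[] oldBytes = writeOld("user-1", 42L, true);
        Counter c = readNew(migrate(oldBytes));
        System.out.println(c.key + " " + c.total + " " + c.lastSeen); // prints "user-1 42 0"
    }
}
```

The sketch keeps the two layouts as separate read/write pairs so that the migration is visibly a composition of the old reader and the new writer, which mirrors the order of operations the page describes.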