This is an automated email from the ASF dual-hosted git repository.

gaoyunhaii pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

The following commit(s) were added to refs/heads/master by this push:
     new c9fb92c  [FLINK-25198][docs] Add doc about name and description of operator
c9fb92c is described below

commit c9fb92c21f1cd5fa7973e5322eb5e3e821b88bcd
Author: 龙三 <wenlong....@alibaba-inc.com>
AuthorDate: Fri Jan 14 16:46:10 2022 +0800

    [FLINK-25198][docs] Add doc about name and description of operator

    This closes #18400.
---
 .../docs/dev/datastream/operators/overview.md      | 36 +++++++++++++++++++
 .../docs/dev/datastream/operators/overview.md      | 40 ++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/docs/content.zh/docs/dev/datastream/operators/overview.md b/docs/content.zh/docs/dev/datastream/operators/overview.md
index a3b5c8e..72017f4 100644
--- a/docs/content.zh/docs/dev/datastream/operators/overview.md
+++ b/docs/content.zh/docs/dev/datastream/operators/overview.md
@@ -755,3 +755,39 @@ some_stream.filter(...).slot_sharing_group("name")
 ```
 {{< /tab >}}
 {{< /tabs>}}
+
+## 名字和描述
+
+Flink里的算子和作业节点会有一个名字和一个描述。名字和描述都是用来介绍一个算子或者节点是在做什么操作,但是它们会被用在不同地方。
+
+名字会用在用户界面、线程名、日志、指标等场景。节点的名字会根据节点中算子的名字来构建。
+名字需要尽可能的简洁,避免对外部系统产生大的压力。
+
+描述主要用在执行计划展示,以及用户界面展示。节点的描述同样是根据节点中算子的描述来构建。
+描述可以包括详细的算子行为的信息,以便我们在运行时进行debug分析。
+
+{{< tabs namedescription>}}
+{{< tab "Java" >}}
+```java
+someStream.filter(...).name("filter").setDescription("x in (1, 2, 3, 4) and y > 1")
+```
+{{< /tab >}}
+{{< tab "Scala" >}}
+```scala
+someStream.filter(...).name("filter").setDescription("x in (1, 2, 3, 4) and y > 1")
+```
+{{< /tab >}}
+{{< tab "Python" >}}
+```python
+some_stream.filter(...).name("filter").set_description("x in (1, 2, 3, 4) and y > 1")
+```
+{{< /tab >}}
+{{< /tabs>}}
+
+节点的描述默认是按照一个多行的树形结构来构建的,用户可以通过把`pipeline.vertex-description-mode`设为`CASCADING`, 实现将描述改为老版本的单行递归模式。
+
+Flink SQL框架生成的算子默认会有一个由算子的类型以及id构成的名字,以及一个带有详细信息的描述。
+用户可以通过将`table.optimizer.simplify-operator-name-enabled`设为`false`,将名字改为和以前的版本一样的详细描述。
+
+当一个作业的拓扑很复杂时,用户可以把`pipeline.vertex-name-include-index-prefix`设为`true`,在节点的名字前增加一个拓扑序的前缀,这样就可以很容易根据指标以及日志的信息快速找到拓扑图中对应节点。
+
diff --git a/docs/content/docs/dev/datastream/operators/overview.md b/docs/content/docs/dev/datastream/operators/overview.md
index 91c5ac0..5744374 100644
--- a/docs/content/docs/dev/datastream/operators/overview.md
+++ b/docs/content/docs/dev/datastream/operators/overview.md
@@ -757,3 +757,43 @@ some_stream.filter(...).slot_sharing_group("name")
 ```
 {{< /tab >}}
 {{< /tabs>}}
+
+## Name And Description
+Operators and job vertices in Flink have a name and a description.
+Both the name and the description describe what an operator or a job vertex is doing, but they are used differently.
+
+The name of an operator or job vertex is used in the web UI, thread names, logging, metrics, etc.
+The name of a job vertex is constructed from the names of the operators in it.
+The name should be as concise as possible to avoid putting high pressure on external systems.
+
+The description is used in the execution plan and displayed as the details of a job vertex in the web UI.
+The description of a job vertex is constructed from the descriptions of the operators in it.
+The description can contain detailed information about operators to facilitate debugging at runtime.
+
+{{< tabs namedescription >}}
+{{< tab "Java" >}}
+```java
+someStream.filter(...).name("filter").setDescription("x in (1, 2, 3, 4) and y > 1")
+```
+{{< /tab >}}
+{{< tab "Scala" >}}
+```scala
+someStream.filter(...).name("filter").setDescription("x in (1, 2, 3, 4) and y > 1")
+```
+{{< /tab >}}
+{{< tab "Python" >}}
+```python
+some_stream.filter(...).name("filter").set_description("x in (1, 2, 3, 4) and y > 1")
+```
+{{< /tab >}}
+{{< /tabs>}}
+
+The description of a job vertex is formatted as a tree-format string by default.
+Users can set `pipeline.vertex-description-mode` to `CASCADING` if they want the description in the cascading format used in former versions.
+
+Operators generated by Flink SQL have, by default, a name consisting of the operator type and an id, plus a detailed description.
+Users can set `table.optimizer.simplify-operator-name-enabled` to `false` if they want the name to be the detailed description, as in former versions.
+
+When the topology of the pipeline is complex, users can add a topological index to the vertex name by setting `pipeline.vertex-name-include-index-prefix` to `true`,
+so that the vertex can easily be found in the graph according to logs or metrics tags.
+
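
The three options named in the added documentation can be collected in one configuration fragment. This is a minimal sketch, assuming the flat `key: value` syntax of `flink-conf.yaml`; whether each key is picked up from that file or has to be set on the job or table configuration is an assumption to verify against the Flink version in use. The keys and values come from the documentation text in the diff, and the defaults mentioned in the comments are inferred from that text.

```yaml
# Use the cascading vertex description of former versions
# (the documentation text says the default is a tree-format string).
pipeline.vertex-description-mode: CASCADING

# Keep the detailed Flink SQL operator names of former versions
# instead of the simplified "operator type + id" names.
table.optimizer.simplify-operator-name-enabled: false

# Add a topological index to each vertex name so that logs and
# metrics tags are easier to match to vertices in the graph.
pipeline.vertex-name-include-index-prefix: true
```

Each setting is independent of the other two, so any one of them can be applied on its own.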