This is an automated email from the ASF dual-hosted git repository.
chufenggao pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler.git
The following commit(s) were added to refs/heads/dev by this push:
new 938f13b568 [Doc] Spark on K8S (#13605)
938f13b568 is described below
commit 938f13b568d2bb667bb3b00127f29295ed0c77c4
Author: Aaron Wang <[email protected]>
AuthorDate: Thu Feb 23 10:31:34 2023 +0800
[Doc] Spark on K8S (#13605)
---
docs/docs/en/guide/task/spark.md | 35 ++++++++++++++++++-----------------
docs/docs/zh/guide/task/spark.md | 5 +++--
2 files changed, 21 insertions(+), 19 deletions(-)
diff --git a/docs/docs/en/guide/task/spark.md b/docs/docs/en/guide/task/spark.md
index ecd328c7d5..3d83d967e1 100644
--- a/docs/docs/en/guide/task/spark.md
+++ b/docs/docs/en/guide/task/spark.md
@@ -20,23 +20,24 @@ Spark task type for executing Spark application. When executing the Spark task,
- Please refer to [DolphinScheduler Task Parameters Appendix](appendix.md)
`Default Task Parameters` section for default parameters.
-| **Parameter**              | **Description** |
-|----------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
-| Program type               | Supports Java, Scala, Python, and SQL. |
-| The class of main function | The **full path** of Main Class, the entry point of the Spark program. |
-| Main jar package           | The Spark jar package (upload by Resource Center). |
-| SQL scripts                | SQL statements in .sql files that Spark sql runs. |
-| Deployment mode            | <ul><li>spark submit supports three modes: yarn-clusetr, yarn-client and local.</li><li>spark sql supports yarn-client and local modes.</li></ul> |
-| Task name                  | Spark task name. |
-| Driver core number         | Set the number of Driver core, which can be set according to the actual production environment. |
-| Driver memory size         | Set the size of Driver memories, which can be set according to the actual production environment. |
-| Number of Executor         | Set the number of Executor, which can be set according to the actual production environment. |
-| Executor memory size       | Set the size of Executor memories, which can be set according to the actual production environment. |
-| Main program parameters    | Set the input parameters of the Spark program and support the substitution of custom parameter variables. |
-| Optional parameters        | Support `--jars`, `--files`,` --archives`, `--conf` format. |
-| Resource                   | Appoint resource files in the `Resource` if parameters refer to them. |
-| Custom parameter           | It is a local user-defined parameter for Spark, and will replace the content with `${variable}` in the script. |
-| Predecessor task           | Selecting a predecessor task for the current task, will set the selected predecessor task as upstream of the current task. |
+| **Parameter**              | **Description** |
+|----------------------------|------------------------------------------------------------------------------------------------------------------------------------|
+| Program type               | Supports Java, Scala, Python, and SQL. |
+| The class of main function | The **full path** of Main Class, the entry point of the Spark program. |
+| Main jar package           | The Spark jar package (upload by Resource Center). |
+| SQL scripts                | SQL statements in .sql files that Spark sql runs. |
+| Deployment mode            | <ul><li>spark submit supports three modes: cluster, client and local.</li><li>spark sql supports client and local modes.</li></ul> |
+| Namespace (cluster)        | Select a namespace to submit the application to a native Kubernetes cluster; if none is selected, the application is submitted to the YARN cluster (default). |
+| Task name                  | Spark task name. |
+| Driver core number         | Set the number of Driver cores, which can be set according to the actual production environment. |
+| Driver memory size         | Set the size of Driver memory, which can be set according to the actual production environment. |
+| Number of Executor         | Set the number of Executors, which can be set according to the actual production environment. |
+| Executor memory size       | Set the size of Executor memory, which can be set according to the actual production environment. |
+| Main program parameters    | Set the input parameters of the Spark program and support the substitution of custom parameter variables. |
+| Optional parameters        | Support `--jars`, `--files`, `--archives`, `--conf` format. |
+| Resource                   | Appoint resource files in the `Resource` if parameters refer to them. |
+| Custom parameter           | It is a local user-defined parameter for Spark, and will replace the content with `${variable}` in the script. |
+| Predecessor task           | Selecting a predecessor task for the current task, will set the selected predecessor task as upstream of the current task. |
## Task Example
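The deployment modes in the table above map onto standard `spark-submit` invocations. A minimal sketch for illustration only (the class name, jar names, and arguments are placeholders, not part of this commit):

```shell
# cluster mode: the driver runs inside the cluster (YARN or Kubernetes)
spark-submit \
  --deploy-mode cluster \
  --class org.example.MainClass \
  --driver-cores 1 --driver-memory 1g \
  --num-executors 2 --executor-memory 2g \
  --jars deps.jar --files app.conf \
  main-app.jar arg1 arg2

# client mode: the driver runs on the submitting worker host
spark-submit --deploy-mode client --class org.example.MainClass main-app.jar

# local mode: run in a single JVM with N worker threads, no cluster needed
spark-submit --master local[2] --class org.example.MainClass main-app.jar
```

DolphinScheduler assembles a command along these lines from the task parameters (Driver core number, Executor memory size, Optional parameters, and so on).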
diff --git a/docs/docs/zh/guide/task/spark.md b/docs/docs/zh/guide/task/spark.md
index 640d72b062..b09d3241a5 100644
--- a/docs/docs/zh/guide/task/spark.md
+++ b/docs/docs/zh/guide/task/spark.md
@@ -24,8 +24,9 @@ Spark 任务类型用于执行 Spark 应用。对于 Spark 节点,worker 支
- 主函数的 Class:Spark 程序的入口 Main class 的全路径。
- 主程序包:执行 Spark 程序的 jar 包(通过资源中心上传)。
- SQL脚本:Spark sql 运行的 .sql 文件中的 SQL 语句。
-- 部署方式:(1) spark submit 支持 yarn-clusetr、yarn-client 和 local 三种模式。
- (2) spark sql 支持 yarn-client 和 local 两种模式。
+- 部署方式:(1) spark submit 支持 cluster、client 和 local 三种模式。
+ (2) spark sql 支持 client 和 local 两种模式。
+- 命名空间(集群):若选择命名空间(集群),则以原生的方式提交至所选择 K8S 集群执行,未选择则提交至 Yarn 集群执行(默认)。
- 任务名称(可选):Spark 程序的名称。
- Driver 核心数:用于设置 Driver 内核数,可根据实际生产环境设置对应的核心数。
- Driver 内存数:用于设置 Driver 内存数,可根据实际生产环境设置对应的内存数。
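The new Namespace (cluster) parameter corresponds to Spark's native Kubernetes scheduler. A hedged sketch of the equivalent manual submission (the API server address, namespace, and image are placeholders):

```shell
# Submit to a native Kubernetes cluster instead of YARN
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.namespace=<namespace> \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///opt/spark/examples/jars/spark-examples.jar
```

When no namespace is selected in the task form, the task falls back to the default YARN submission described in the Deployment mode row.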