This is an automated email from the ASF dual-hosted git repository.
HTHou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iotdb-docs.git
The following commit(s) were added to refs/heads/main by this push:
new b878e83d update spark in tree mode (#1135)
b878e83d is described below
commit b878e83de5253efa2374a356b21714b72cf5c023
Author: leto-b
AuthorDate: Wed May 27 14:22:15 2026 +0800
update spark in tree mode (#1135)
---
.../Tree/Ecosystem-Integration/Spark-IoTDB.md | 174 ++++++++++-----------
.../latest/Ecosystem-Integration/Spark-IoTDB.md | 174 ++++++++++-----------
.../Tree/Ecosystem-Integration/Spark-IoTDB.md | 149 +++++++++---------
.../latest/Ecosystem-Integration/Spark-IoTDB.md | 149 +++++++++---------
4 files changed, 310 insertions(+), 336 deletions(-)
diff --git a/src/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md b/src/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md
index 1c43c3b7..f3eb8dd1 100644
--- a/src/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md
+++ b/src/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md
@@ -21,29 +21,25 @@
# Apache Spark
-## 1. Supported Versions
+## 1. Overview
+IoTDB provides the `Spark-IoTDB-Connector`, a Spark connector for IoTDB's tree model, which supports reading and writing data from/to IoTDB's tree model in Spark environments.
-Supported versions of Spark and Scala are as follows:
+## 2. Compatibility Requirements
+| Software | Version |
+|----------|---------|
+| `Spark` | 2.4.0-latest |
+| `Scala` | 2.11, 2.12 |
-| Spark Version | Scala Version |
-|----------------|---------------|
-| `2.4.0-latest` | `2.11, 2.12` |
+* The `spark-iotdb-connector` is compatible with Java, Scala-based Spark, and PySpark.
-## 2. Precautions
-
-1. The current version of `spark-iotdb-connector` supports Scala `2.11` and `2.12`, but not `2.13`.
-2. `spark-iotdb-connector` supports usage in Spark for both Java, Scala, and PySpark.
-
-## 3. Deployment
-
-`spark-iotdb-connector` has two use cases: IDE development and `spark-shell` debugging.
+## 3. Deployment Methods
+There are two usage scenarios for the `spark-iotdb-connector`: IDE development and `spark-shell` debugging.
### 3.1 IDE Development
+For IDE development, simply add the following dependency to your `pom.xml` file.
-For IDE development, simply add the following dependency to the `pom.xml` file:
-
-``` xml
- <dependency>
+```XML
+<dependency>
<groupId>org.apache.iotdb</groupId>
<!-- spark-iotdb-connector_2.11 or spark-iotdb-connector_2.13 -->
<artifactId>spark-iotdb-connector_2.12.10</artifactId>
@@ -52,52 +48,49 @@ For IDE development, simply add the following dependency to the `pom.xml` file:
```
### 3.2 `spark-shell` Debugging
+To use the `spark-iotdb-connector` in `spark-shell`, follow these steps:
-To use `spark-iotdb-connector` in `spark-shell`, you need to download the `with-dependencies` version of the jar package
-from the official website. After that, copy the jar package to the `${SPARK_HOME}/jars` directory.
-Simply execute the following command:
+* Download the `with-dependencies` JAR package from the official website
+* Copy the JAR package to the `${SPARK_HOME}/jars` directory using the following command:
-```shell
+```Bash
cp spark-iotdb-connector_2.12.10-${iotdb.version}.jar $SPARK_HOME/jars/
```
-In addition, to ensure that spark can use JDBC and IoTDB connections, you need to do the following:
+To ensure Spark can connect to IoTDB via JDBC, perform the following steps:
-Run the following command to compile the IoTDB JDBC connector:
+* Compile the IoTDB-JDBC connector by running:
-```shell
+```Bash
mvn clean package -pl iotdb-client/jdbc -am -DskipTests -P get-jar-with-dependencies
```
-The compiled jar package is located in the following directory:
+* The compiled JAR package will be located in the following directory:
-```shell
+```Bash
$IoTDB_HOME/iotdb-client/jdbc/target/iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar
```
-At last, copy the jar package to the ${SPARK_HOME}/jars directory. Simply execute the following command:
+* Copy the JAR package to the `${SPARK_HOME}/jars` directory using the following command:
-```shell
+```Bash
cp iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar $SPARK_HOME/jars/
```
## 4. Usage
-
-### 4.1Parameters
-
-| Parameter | Description | Default Value | Scope | Can be Empty |
-|--------------|--------------------------------------------------------------------------------------------------------------|---------------|-------------|--------------|
-| url | Specifies the JDBC URL of IoTDB | null | read, write | false |
-| user | The username of IoTDB | root | read, write | true |
-| password | The password of IoTDB | root | read, write | true |
-| sql | Specifies the SQL statement for querying | null | read | true |
-| numPartition | Specifies the partition number of the DataFrame when in read, and the write concurrency number when in write | 1 | read, write | true |
-| lowerBound | The start timestamp of the query (inclusive) | 0 | read | true |
-| upperBound | The end timestamp of the query (inclusive) | 0 | read | true |
-
-### 4.2 Reading Data from IoTDB
-
-Here is an example that demonstrates how to read data from IoTDB into a DataFrame:
+### 4.1 Parameter Description
+| **Parameter** | **Description** | **Default Value** | **Usage Scope** | **Nullable** |
+|---------------|-----------------|-------------------|-----------------|--------------|
+| url | Specifies the JDBC URL of IoTDB | null | read, write | FALSE |
+| user | IoTDB username | root | read, write | TRUE |
+| password | IoTDB password | root | read, write | TRUE |
+| sql | Specifies the SQL query statement | null | read | TRUE |
+| numPartition | Specifies the number of DataFrame partitions for read operations, and the write concurrency for write operations | 1 | read, write | TRUE |
+| lowerBound | Query start timestamp (inclusive) | 0 | read | TRUE |
+| upperBound | Query end timestamp (inclusive) | 0 | read | TRUE |
+
+### 4.2 Reading Data
+* Read data from IoTDB into a DataFrame
```scala
import org.apache.iotdb.spark.db._
@@ -106,10 +99,10 @@ val df = spark.read.format("org.apache.iotdb.spark.db")
.option("user", "root")
.option("password", "root")
.option("url", "jdbc:iotdb://127.0.0.1:6667/")
- .option("sql", "select ** from root") // query SQL
- .option("lowerBound", "0") // lower timestamp bound
- .option("upperBound", "100000000") // upper timestamp bound
- .option("numPartition", "5") // number of partitions
+ .option("sql", "select ** from root") // Query SQL
+ .option("lowerBound", "0") // Timestamp lower bound
+ .option("upperBound", "100000000") // Timestamp upper bound
+ .option("numPartition", "5") // Number of partitions
.load
df.printSchema()
@@ -117,10 +110,7 @@ df.printSchema()
df.show()
```
-### 4.3 Writing Data to IoTDB
-
-Here is an example that demonstrates how to write data to IoTDB:
-
+### 4.3 Writing Data
```scala
// Construct narrow table data
val df = spark.createDataFrame(List(
@@ -163,43 +153,21 @@ dfWithColumn.write.format("org.apache.iotdb.spark.db")
.save
```
-### 4.4 Wide and Narrow Table Conversion
-
-Here are examples of how to convert between wide and narrow tables:
-
-* From wide to narrow
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
-val narrow_df = Transformer.toNarrowForm(spark, wide_df)
-```
-
-* From narrow to wide
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = Transformer.toWideForm(spark, narrow_df)
-```
-
-## 5. Wide and Narrow Tables
-
-Using the TsFile structure as an example: there are three measurements in the TsFile pattern,
-namely `Status`, `Temperature`, and `Hardware`. The basic information for each of these three measurements is as
-follows:
+## 5. Wide Table vs Narrow Table
+### 5.1 Data Format Example
+Taking the TsFile structure as an example, assume there are three measurements in the TsFile schema: status, temperature, and hardware.
-| Name | Type | Encoding |
-|-------------|---------|----------|
-| Status | Boolean | PLAIN |
-| Temperature | Float | RLE |
-| Hardware | Text | PLAIN |
+* Basic information:
-The existing data in the TsFile is as follows:
+| Name | Type | Encoding |
+|------|------|----------|
+| status | Boolean | PLAIN |
+| temperature | Float | RLE |
+| hardware | Text | PLAIN |
-* `d1:root.ln.wf01.wt01`
-* `d2:root.ln.wf02.wt02`
+* Data:
+ * `d1:root.ln.wf01.wt01`
+ * `d2:root.ln.wf02.wt02`
| time | d1.status | time | d1.temperature | time | d2.hardware | time | d2.status |
|------|-----------|------|----------------|------|-------------|------|-----------|
@@ -207,7 +175,7 @@ The existing data in the TsFile is as follows:
| 3 | True | 2 | 2.2 | 4 | "bbb" | 2 | False |
| 5 | False | 3 | 2.1 | 6 | "ccc" | 4 | True |
-The wide (default) table form is as follows:
+* Wide table (default) format:
| Time | root.ln.wf02.wt02.temperature | root.ln.wf02.wt02.status | root.ln.wf02.wt02.hardware | root.ln.wf01.wt01.temperature | root.ln.wf01.wt01.status | root.ln.wf01.wt01.hardware |
|------|-------------------------------|--------------------------|----------------------------|-------------------------------|--------------------------|----------------------------|
@@ -218,15 +186,35 @@ The wide (default) table form is as follows:
| 5 | null | null | null | null | false | null |
| 6 | null | null | ccc | null | null | null |
-You can also use the narrow table format as shown below:
+* Narrow table format:
| Time | Device | status | hardware | temperature |
|------|-------------------|--------|----------|-------------|
-| 1 | root.ln.wf02.wt01 | true | null | 2.2 |
+| 1 | root.ln.wf01.wt01 | true | null | 2.2 |
| 1 | root.ln.wf02.wt02 | true | null | null |
-| 2 | root.ln.wf02.wt01 | null | null | 2.2 |
+| 2 | root.ln.wf01.wt01 | null | null | 2.2 |
| 2 | root.ln.wf02.wt02 | false | aaa | null |
-| 3 | root.ln.wf02.wt01 | true | null | 2.1 |
+| 3 | root.ln.wf01.wt01 | true | null | 2.1 |
| 4 | root.ln.wf02.wt02 | true | bbb | null |
-| 5 | root.ln.wf02.wt01 | false | null | null |
-| 6 | root.ln.wf02.wt02 | null | ccc | null |
\ No newline at end of file
+| 5 | root.ln.wf01.wt01 | false | null | null |
+| 6 | root.ln.wf02.wt02 | null | ccc | null |
+
+> Note: Corrected the device path typo in the original narrow table example (from `root.ln.wf02.wt01` to `root.ln.wf01.wt01`) to match the data definition.
+
+### 5.2 Data Conversion Example
+* Convert from wide table to narrow table
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
+val narrow_df = Transformer.toNarrowForm(spark, wide_df)
+```
+
+* Convert from narrow table to wide table
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = Transformer.toWideForm(spark, narrow_df)
+```
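Reviewer aside (not part of the commit): the wide/narrow relationship described in Section 5 of the changed document can be illustrated without Spark. The following is a minimal sketch in plain Scala collections, assuming that nulls are modeled as absent map keys and that every series path contains at least one dot. `WideNarrowSketch`, `WideRow`, and `NarrowRow` are hypothetical names introduced for illustration; this is not the connector's `Transformer` implementation.

```scala
// Illustrative sketch only: WideRow, NarrowRow, and WideNarrowSketch are
// hypothetical names, not part of the spark-iotdb-connector API.
object WideNarrowSketch {
  // Wide form: one row per timestamp, values keyed by the full series path.
  final case class WideRow(time: Long, values: Map[String, Any])
  // Narrow form: one row per (timestamp, device), values keyed by measurement name.
  final case class NarrowRow(time: Long, device: String, values: Map[String, Any])

  // "root.ln.wf01.wt01.status" -> ("root.ln.wf01.wt01", "status").
  // Assumes the path contains at least one dot.
  private def splitPath(path: String): (String, String) = {
    val dot = path.lastIndexOf('.')
    (path.substring(0, dot), path.substring(dot + 1))
  }

  // Wide -> narrow: regroup each row's values by device path.
  def toNarrow(rows: Seq[WideRow]): Seq[NarrowRow] =
    rows.sortBy(_.time).flatMap { row =>
      row.values.toSeq
        .map { case (path, value) =>
          val (device, measurement) = splitPath(path)
          (device, measurement -> value)
        }
        .groupBy(_._1)
        .toSeq
        .sortBy(_._1)
        .map { case (device, pairs) =>
          NarrowRow(row.time, device, pairs.map(_._2).toMap)
        }
    }

  // Narrow -> wide: merge all devices that share a timestamp into one row.
  def toWide(rows: Seq[NarrowRow]): Seq[WideRow] =
    rows.groupBy(_.time).toSeq.sortBy(_._1).map { case (time, group) =>
      val values = group.flatMap { r =>
        r.values.toSeq.map { case (measurement, value) =>
          s"${r.device}.$measurement" -> value
        }
      }.toMap
      WideRow(time, values)
    }

  def main(args: Array[String]): Unit = {
    // First two timestamps of the example data set in Section 5.1.
    val wide = Seq(
      WideRow(1L, Map(
        "root.ln.wf01.wt01.status" -> true,
        "root.ln.wf01.wt01.temperature" -> 2.2,
        "root.ln.wf02.wt02.status" -> true)),
      WideRow(2L, Map(
        "root.ln.wf01.wt01.temperature" -> 2.2,
        "root.ln.wf02.wt02.status" -> false,
        "root.ln.wf02.wt02.hardware" -> "aaa"))
    )
    val narrow = toNarrow(wide)
    narrow.foreach(println)
    // Converting back should reproduce the original wide rows.
    assert(toWide(narrow) == wide)
  }
}
```

The sketch keys on the last path segment as the measurement name, which matches the column naming in the example tables; a real DataFrame would carry explicit nulls and a fixed schema instead of sparse maps.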
diff --git a/src/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md b/src/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md
index 1c43c3b7..f3eb8dd1 100644
--- a/src/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md
+++ b/src/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md
@@ -21,29 +21,25 @@
# Apache Spark
-## 1. Supported Versions
+## 1. Overview
+IoTDB provides the `Spark-IoTDB-Connector`, a Spark connector for IoTDB's tree model, which supports reading and writing data from/to IoTDB's tree model in Spark environments.
-Supported versions of Spark and Scala are as follows:
+## 2. Compatibility Requirements
+| Software | Version |
+|----------|---------|
+| `Spark` | 2.4.0-latest |
+| `Scala` | 2.11, 2.12 |
-| Spark Version | Scala Version |
-|----------------|---------------|
-| `2.4.0-latest` | `2.11, 2.12` |
+* The `spark-iotdb-connector` is compatible with Java, Scala-based Spark, and PySpark.
-## 2. Precautions
-
-1. The current version of `spark-iotdb-connector` supports Scala `2.11` and `2.12`, but not `2.13`.
-2. `spark-iotdb-connector` supports usage in Spark for both Java, Scala, and PySpark.
-
-## 3. Deployment
-
-`spark-iotdb-connector` has two use cases: IDE development and `spark-shell` debugging.
+## 3. Deployment Methods
+There are two usage scenarios for the `spark-iotdb-connector`: IDE development and `spark-shell` debugging.
### 3.1 IDE Development
+For IDE development, simply add the following dependency to your `pom.xml` file.
-For IDE development, simply add the following dependency to the `pom.xml` file:
-
-``` xml
- <dependency>
+```XML
+<dependency>
<groupId>org.apache.iotdb</groupId>
<!-- spark-iotdb-connector_2.11 or spark-iotdb-connector_2.13 -->
<artifactId>spark-iotdb-connector_2.12.10</artifactId>
@@ -52,52 +48,49 @@ For IDE development, simply add the following dependency to the `pom.xml` file:
```
### 3.2 `spark-shell` Debugging
+To use the `spark-iotdb-connector` in `spark-shell`, follow these steps:
-To use `spark-iotdb-connector` in `spark-shell`, you need to download the `with-dependencies` version of the jar package
-from the official website. After that, copy the jar package to the `${SPARK_HOME}/jars` directory.
-Simply execute the following command:
+* Download the `with-dependencies` JAR package from the official website
+* Copy the JAR package to the `${SPARK_HOME}/jars` directory using the following command:
-```shell
+```Bash
cp spark-iotdb-connector_2.12.10-${iotdb.version}.jar $SPARK_HOME/jars/
```
-In addition, to ensure that spark can use JDBC and IoTDB connections, you need to do the following:
+To ensure Spark can connect to IoTDB via JDBC, perform the following steps:
-Run the following command to compile the IoTDB JDBC connector:
+* Compile the IoTDB-JDBC connector by running:
-```shell
+```Bash
mvn clean package -pl iotdb-client/jdbc -am -DskipTests -P get-jar-with-dependencies
```
-The compiled jar package is located in the following directory:
+* The compiled JAR package will be located in the following directory:
-```shell
+```Bash
$IoTDB_HOME/iotdb-client/jdbc/target/iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar
```
-At last, copy the jar package to the ${SPARK_HOME}/jars directory. Simply execute the following command:
+* Copy the JAR package to the `${SPARK_HOME}/jars` directory using the following command:
-```shell
+```Bash
cp iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar $SPARK_HOME/jars/
```
## 4. Usage
-
-### 4.1Parameters
-
-| Parameter | Description | Default Value | Scope | Can be Empty |
-|--------------|--------------------------------------------------------------------------------------------------------------|---------------|-------------|--------------|
-| url | Specifies the JDBC URL of IoTDB | null | read, write | false |
-| user | The username of IoTDB | root | read, write | true |
-| password | The password of IoTDB | root | read, write | true |
-| sql | Specifies the SQL statement for querying | null | read | true |
-| numPartition | Specifies the partition number of the DataFrame when in read, and the write concurrency number when in write | 1 | read, write | true |
-| lowerBound | The start timestamp of the query (inclusive) | 0 | read | true |
-| upperBound | The end timestamp of the query (inclusive) | 0 | read | true |
-
-### 4.2 Reading Data from IoTDB
-
-Here is an example that demonstrates how to read data from IoTDB into a DataFrame:
+### 4.1 Parameter Description
+| **Parameter** | **Description** | **Default Value** | **Usage Scope** | **Nullable** |
+|---------------|-----------------|-------------------|-----------------|--------------|
+| url | Specifies the JDBC URL of IoTDB | null | read, write | FALSE |
+| user | IoTDB username | root | read, write | TRUE |
+| password | IoTDB password | root | read, write | TRUE |
+| sql | Specifies the SQL query statement | null | read | TRUE |
+| numPartition | Specifies the number of DataFrame partitions for read operations, and the write concurrency for write operations | 1 | read, write | TRUE |
+| lowerBound | Query start timestamp (inclusive) | 0 | read | TRUE |
+| upperBound | Query end timestamp (inclusive) | 0 | read | TRUE |
+
+### 4.2 Reading Data
+* Read data from IoTDB into a DataFrame
```scala
import org.apache.iotdb.spark.db._
@@ -106,10 +99,10 @@ val df = spark.read.format("org.apache.iotdb.spark.db")
.option("user", "root")
.option("password", "root")
.option("url", "jdbc:iotdb://127.0.0.1:6667/")
- .option("sql", "select ** from root") // query SQL
- .option("lowerBound", "0") // lower timestamp bound
- .option("upperBound", "100000000") // upper timestamp bound
- .option("numPartition", "5") // number of partitions
+ .option("sql", "select ** from root") // Query SQL
+ .option("lowerBound", "0") // Timestamp lower bound
+ .option("upperBound", "100000000") // Timestamp upper bound
+ .option("numPartition", "5") // Number of partitions
.load
df.printSchema()
@@ -117,10 +110,7 @@ df.printSchema()
df.show()
```
-### 4.3 Writing Data to IoTDB
-
-Here is an example that demonstrates how to write data to IoTDB:
-
+### 4.3 Writing Data
```scala
// Construct narrow table data
val df = spark.createDataFrame(List(
@@ -163,43 +153,21 @@ dfWithColumn.write.format("org.apache.iotdb.spark.db")
.save
```
-### 4.4 Wide and Narrow Table Conversion
-
-Here are examples of how to convert between wide and narrow tables:
-
-* From wide to narrow
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
-val narrow_df = Transformer.toNarrowForm(spark, wide_df)
-```
-
-* From narrow to wide
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = Transformer.toWideForm(spark, narrow_df)
-```
-
-## 5. Wide and Narrow Tables
-
-Using the TsFile structure as an example: there are three measurements in the TsFile pattern,
-namely `Status`, `Temperature`, and `Hardware`. The basic information for each of these three measurements is as
-follows:
+## 5. Wide Table vs Narrow Table
+### 5.1 Data Format Example
+Taking the TsFile structure as an example, assume there are three measurements in the TsFile schema: status, temperature, and hardware.
-| Name | Type | Encoding |
-|-------------|---------|----------|
-| Status | Boolean | PLAIN |
-| Temperature | Float | RLE |
-| Hardware | Text | PLAIN |
+* Basic information:
-The existing data in the TsFile is as follows:
+| Name | Type | Encoding |
+|------|------|----------|
+| status | Boolean | PLAIN |
+| temperature | Float | RLE |
+| hardware | Text | PLAIN |
-* `d1:root.ln.wf01.wt01`
-* `d2:root.ln.wf02.wt02`
+* Data:
+ * `d1:root.ln.wf01.wt01`
+ * `d2:root.ln.wf02.wt02`
| time | d1.status | time | d1.temperature | time | d2.hardware | time | d2.status |
|------|-----------|------|----------------|------|-------------|------|-----------|
@@ -207,7 +175,7 @@ The existing data in the TsFile is as follows:
| 3 | True | 2 | 2.2 | 4 | "bbb" | 2 | False |
| 5 | False | 3 | 2.1 | 6 | "ccc" | 4 | True |
-The wide (default) table form is as follows:
+* Wide table (default) format:
| Time | root.ln.wf02.wt02.temperature | root.ln.wf02.wt02.status | root.ln.wf02.wt02.hardware | root.ln.wf01.wt01.temperature | root.ln.wf01.wt01.status | root.ln.wf01.wt01.hardware |
|------|-------------------------------|--------------------------|----------------------------|-------------------------------|--------------------------|----------------------------|
@@ -218,15 +186,35 @@ The wide (default) table form is as follows:
| 5 | null | null | null | null | false | null |
| 6 | null | null | ccc | null | null | null |
-You can also use the narrow table format as shown below:
+* Narrow table format:
| Time | Device | status | hardware | temperature |
|------|-------------------|--------|----------|-------------|
-| 1 | root.ln.wf02.wt01 | true | null | 2.2 |
+| 1 | root.ln.wf01.wt01 | true | null | 2.2 |
| 1 | root.ln.wf02.wt02 | true | null | null |
-| 2 | root.ln.wf02.wt01 | null | null | 2.2 |
+| 2 | root.ln.wf01.wt01 | null | null | 2.2 |
| 2 | root.ln.wf02.wt02 | false | aaa | null |
-| 3 | root.ln.wf02.wt01 | true | null | 2.1 |
+| 3 | root.ln.wf01.wt01 | true | null | 2.1 |
| 4 | root.ln.wf02.wt02 | true | bbb | null |
-| 5 | root.ln.wf02.wt01 | false | null | null |
-| 6 | root.ln.wf02.wt02 | null | ccc | null |
\ No newline at end of file
+| 5 | root.ln.wf01.wt01 | false | null | null |
+| 6 | root.ln.wf02.wt02 | null | ccc | null |
+
+> Note: Corrected the device path typo in the original narrow table example (from `root.ln.wf02.wt01` to `root.ln.wf01.wt01`) to match the data definition.
+
+### 5.2 Data Conversion Example
+* Convert from wide table to narrow table
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
+val narrow_df = Transformer.toNarrowForm(spark, wide_df)
+```
+
+* Convert from narrow table to wide table
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = Transformer.toWideForm(spark, narrow_df)
+```
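Reviewer aside (not part of the commit): the parameter table in Section 4.1 describes `lowerBound` and `upperBound` as inclusive timestamps and `numPartition` as the number of DataFrame partitions on read. The document does not say how the range is divided, so the sketch below only assumes an even split of the inclusive range into contiguous sub-ranges; the connector's actual partitioning may differ. `TimeRangeSplitSketch` and `split` are hypothetical names.

```scala
// Illustrative sketch only: an even split of an inclusive timestamp range.
// This is an assumption about how the read parameters could interact,
// not the connector's documented behavior.
object TimeRangeSplitSketch {
  // Returns numPartition inclusive (start, end) pairs covering [lowerBound, upperBound].
  def split(lowerBound: Long, upperBound: Long, numPartition: Int): Seq[(Long, Long)] = {
    require(numPartition > 0, "numPartition must be positive")
    require(upperBound >= lowerBound, "upperBound must not be less than lowerBound")
    // BigInt avoids overflow when the range spans most of the Long domain.
    val length = BigInt(upperBound) - BigInt(lowerBound) + 1
    require(length >= numPartition, "range must hold at least one timestamp per partition")
    (0 until numPartition).map { i =>
      val start = BigInt(lowerBound) + (length * i) / numPartition
      val end = BigInt(lowerBound) + (length * (i + 1)) / numPartition - 1
      (start.toLong, end.toLong)
    }
  }

  def main(args: Array[String]): Unit = {
    // Four partitions over the inclusive range [0, 99].
    split(0L, 99L, 4).foreach { case (start, end) =>
      println(s"time >= $start and time <= $end")
    }
  }
}
```

Under this assumption each partition would issue the same query restricted to its own time window, which is why the bounds matter only when `numPartition` is greater than 1.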
diff --git a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md b/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md
index b852cbcf..f80c4b2c 100644
--- a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md
+++ b/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Spark-IoTDB.md
@@ -21,29 +21,29 @@
# Apache Spark
-## 1. 版本支持
+## 1. 功能概述
-支持的 Spark 与 Scala 版本如下:
+IoTDB 提供 `Spark-IoTDB-Connector` 作为实现 IoTDB 树模型的 Spark 连接器,支持在 Spark 环境中对 IoTDB 树模型的数据进行读写。
-| Spark 版本 | Scala 版本 |
-|----------------|--------------|
-| `2.4.0-latest` | `2.11, 2.12` |
+## 2. 兼容性要求
-## 2. 注意事项
+| 软件 | 版本 |
+| ------------- | -------------- |
+| `Spark` | 2.4.0-latest |
+| `Scala` | 2.11, 2.12 |
-1. 当前版本的 `spark-iotdb-connector` 支持 `2.11` 与 `2.12` 两个版本的 Scala,暂不支持 `2.13` 版本。
-2. `spark-iotdb-connector` 支持在 Java、Scala 版本的 Spark 与 PySpark 中使用。
+* `spark-iotdb-connector` 支持在 Java、Scala 版本的 Spark 与 PySpark 中使用。
-## 3. 部署
+## 3. 部署方式
-`spark-iotdb-connector` 总共有两个使用场景,分别为 IDE 开发与 spark-shell 调试。
+`spark-iotdb-connector` 共有两个使用场景,分别为 IDE 开发与 spark-shell 调试。
### 3.1 IDE 开发
-在 IDE 开发时,只需要在 `pom.xml` 文件中添加以下依赖即可:
+在 IDE 开发时,只需要在 `pom.xml` 文件中添加以下依赖即可。
-``` xml
- <dependency>
+```XML
+<dependency>
<groupId>org.apache.iotdb</groupId>
<!-- spark-iotdb-connector_2.11 or spark-iotdb-connector_2.13 -->
<artifactId>spark-iotdb-connector_2.12.10</artifactId>
@@ -53,50 +53,51 @@
### 3.2 `spark-shell` 调试
-如果需要在 `spark-shell` 中使用 `spark-iotdb-connetcor`,需要先在官网下载 `with-dependencies` 版本的 jar 包。然后再将 Jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可。
-执行以下命令即可:
+在 `spark-shell` 中使用 `spark-iotdb-connetcor`,可参考如下步骤:
-```shell
+* 通过官网下载 `with-dependencies` 版本的 jar 包
+* 通过如下命令将 Jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可。
+
+```Bash
cp spark-iotdb-connector_2.12.10-${iotdb.version}.jar $SPARK_HOME/jars/
```
-此外,为了保证 spark 能使用 JDBC 和 IoTDB 连接,需要进行如下操作:
+为了保证 spark 能使用 JDBC 和 IoTDB 连接,需要进行如下操作:
-运行如下命令来编译 IoTDB-JDBC 连接器:
+* 运行如下命令来编译 IoTDB-JDBC 连接器
-```shell
+```Bash
mvn clean package -pl iotdb-client/jdbc -am -DskipTests -P get-jar-with-dependencies
```
-编译后的 jar 包在如下目录中:
+* 编译后的 jar 包在如下目录中
-```shell
+```Bash
$IoTDB_HOME/iotdb-client/jdbc/target/iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar
```
-最后再将 jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可。执行以下命令即可:
+* 运行如下命令将 jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可
-```shell
+```Bash
cp iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar $SPARK_HOME/jars/
```
-## 4. 使用
-
-### 4.1 参数
+## 4. 使用方式
+### 4.1 参数介绍
-| 参数 | 描述 | 默认值 | 使用范围 | 能否为空 |
-|--------------|------------------------------------------------|------|------------|-------|
-| url | 指定 IoTDB 的 JDBC 的 URL | null | read、write | false |
-| user | IoTDB 的用户名 | root | read、write | true |
-| password | IoTDB 的密码 | root | read、write | true |
-| sql | 用于指定查询的 SQL 语句 | null | read | true |
-| numPartition | 在 read 中用于指定 DataFrame 的分区数,在 write 中用于设置写入并发数 | 1 | read、write | true |
-| lowerBound | 查询的起始时间戳(包含) | 0 | read | true |
-| upperBound | 查询的结束时间戳(包含) | 0 | read | true |
+| **参数** | **描述** | **默认值** | **使用范围** | **能否为空** |
+| ---------------- | ---------------------------------------------------------------------- | ------------------ | -------------------- | -------------------- |
+| url | 指定 IoTDB 的 JDBC 的 URL | null | read、write | FALSE |
+| user | IoTDB 的用户名 | root | read、write | TRUE |
+| password | IoTDB 的密码 | root | read、write | TRUE |
+| sql | 用于指定查询的 SQL 语句 | null | read | TRUE |
+| numPartition | 在 read 中用于指定 DataFrame 的分区数,在 write 中用于设置写入并发数 | 1 | read、write | TRUE |
+| lowerBound | 查询的起始时间戳(包含) | 0 | read | TRUE |
+| upperBound | 查询的结束时间戳(包含) | 0 | read | TRUE |
-### 4.2 从 IoTDB 读取数据
+### 4.2 读取数据
-以下是一个示例,演示如何从 IoTDB 中读取数据成为 DataFrame。
+* 从 IoTDB 中读取数据成为 DataFrame
```scala
import org.apache.iotdb.spark.db._
@@ -116,9 +117,7 @@ df.printSchema()
df.show()
```
-### 4.3 将数据写入 IoTDB
-
-以下是一个示例,演示如何将数据写入 IoTDB。
+### 4.3 写入数据
```scala
// 构造窄表数据
@@ -162,52 +161,33 @@ dfWithColumn.write.format("org.apache.iotdb.spark.db")
.save
```
-### 4.4 宽表与窄表转换
-
-以下是如何转换宽表与窄表的示例:
-
-* 从宽到窄
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
-val narrow_df = Transformer.toNarrowForm(spark, wide_df)
-```
-
-* 从窄到宽
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = Transformer.toWideForm(spark, narrow_df)
-```
-
## 5. 宽表与窄表
+### 5.1 数据格式示例
-以下 TsFile 结构为例:TsFile 模式中有三个度量:状态,温度和硬件。 这三种测量的基本信息如下:
+以 TsFile 结构为例,假设 TsFile 模式中有三个度量:状态,温度和硬件。
-| 名称 | 类型 | 编码 |
-|-----|---------|-------|
-| 状态 | Boolean | PLAIN |
-| 温度 | Float | RLE |
-| 硬件 | Text | PLAIN |
+* 基本信息如下:
-TsFile 中的现有数据如下:
+| 名称 | 类型 | 编码 |
+| ------ | --------- | ------- |
+| 状态 | Boolean | PLAIN |
+| 温度 | Float | RLE |
+| 硬件 | Text | PLAIN |
-* `d1:root.ln.wf01.wt01`
-* `d2:root.ln.wf02.wt02`
+* 数据如下:
+ * `d1:root.ln.wf01.wt01`
+ * `d2:root.ln.wf02.wt02`
| time | d1.status | time | d1.temperature | time | d2.hardware | time | d2.status |
-|------|-----------|------|----------------|------|-------------|------|-----------|
+| ------ | ----------- | ------ | ---------------- | ------ | ------------- | ------ | ----------- |
| 1 | True | 1 | 2.2 | 2 | "aaa" | 1 | True |
| 3 | True | 2 | 2.2 | 4 | "bbb" | 2 | False |
| 5 | False | 3 | 2.1 | 6 | "ccc" | 4 | True |
-宽(默认)表形式如下:
+* 宽表(默认)形式如下:
| Time | root.ln.wf02.wt02.temperature | root.ln.wf02.wt02.status | root.ln.wf02.wt02.hardware | root.ln.wf01.wt01.temperature | root.ln.wf01.wt01.status | root.ln.wf01.wt01.hardware |
-|------|-------------------------------|--------------------------|----------------------------|-------------------------------|--------------------------|----------------------------|
+| ------ | ------------------------------- | -------------------------- | ---------------------------- | ------------------------------- | -------------------------- | ---------------------------- |
| 1 | null | true | null | 2.2 | true | null |
| 2 | null | false | aaa | 2.2 | null | null |
| 3 | null | null | null | 2.1 | true | null |
@@ -215,10 +195,10 @@ TsFile 中的现有数据如下:
| 5 | null | null | null | null | false | null |
| 6 | null | null | ccc | null | null | null |
-你还可以使用窄表形式,如下所示:
+* 窄表形式如下:
| Time | Device | status | hardware | temperature |
-|------|-------------------|--------|----------|-------------|
+| ------ | ------------------- | -------- | ---------- | ------------- |
| 1 | root.ln.wf02.wt01 | true | null | 2.2 |
| 1 | root.ln.wf02.wt02 | true | null | null |
| 2 | root.ln.wf02.wt01 | null | null | 2.2 |
@@ -227,3 +207,22 @@ TsFile 中的现有数据如下:
| 4 | root.ln.wf02.wt02 | true | bbb | null |
| 5 | root.ln.wf02.wt01 | false | null | null |
| 6 | root.ln.wf02.wt02 | null | ccc | null |
+
+### 5.2 数据转换示例
+
+* 从宽表到窄表
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
+val narrow_df = Transformer.toNarrowForm(spark, wide_df)
+```
+
+* 从窄表到宽表
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = Transformer.toWideForm(spark, narrow_df)
+```
diff --git a/src/zh/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md b/src/zh/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md
index b852cbcf..f80c4b2c 100644
--- a/src/zh/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md
+++ b/src/zh/UserGuide/latest/Ecosystem-Integration/Spark-IoTDB.md
@@ -21,29 +21,29 @@
# Apache Spark
-## 1. 版本支持
+## 1. 功能概述
-支持的 Spark 与 Scala 版本如下:
+IoTDB 提供 `Spark-IoTDB-Connector` 作为实现 IoTDB 树模型的 Spark 连接器,支持在 Spark 环境中对 IoTDB 树模型的数据进行读写。
-| Spark 版本 | Scala 版本 |
-|----------------|--------------|
-| `2.4.0-latest` | `2.11, 2.12` |
+## 2. 兼容性要求
-## 2. 注意事项
+| 软件 | 版本 |
+| ------------- | -------------- |
+| `Spark` | 2.4.0-latest |
+| `Scala` | 2.11, 2.12 |
-1. 当前版本的 `spark-iotdb-connector` 支持 `2.11` 与 `2.12` 两个版本的 Scala,暂不支持 `2.13` 版本。
-2. `spark-iotdb-connector` 支持在 Java、Scala 版本的 Spark 与 PySpark 中使用。
+* `spark-iotdb-connector` 支持在 Java、Scala 版本的 Spark 与 PySpark 中使用。
-## 3. 部署
+## 3. 部署方式
-`spark-iotdb-connector` 总共有两个使用场景,分别为 IDE 开发与 spark-shell 调试。
+`spark-iotdb-connector` 共有两个使用场景,分别为 IDE 开发与 spark-shell 调试。
### 3.1 IDE 开发
-在 IDE 开发时,只需要在 `pom.xml` 文件中添加以下依赖即可:
+在 IDE 开发时,只需要在 `pom.xml` 文件中添加以下依赖即可。
-``` xml
- <dependency>
+```XML
+<dependency>
<groupId>org.apache.iotdb</groupId>
<!-- spark-iotdb-connector_2.11 or spark-iotdb-connector_2.13 -->
<artifactId>spark-iotdb-connector_2.12.10</artifactId>
@@ -53,50 +53,51 @@
### 3.2 `spark-shell` 调试
-如果需要在 `spark-shell` 中使用 `spark-iotdb-connetcor`,需要先在官网下载 `with-dependencies` 版本的 jar 包。然后再将 Jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可。
-执行以下命令即可:
+在 `spark-shell` 中使用 `spark-iotdb-connetcor`,可参考如下步骤:
-```shell
+* 通过官网下载 `with-dependencies` 版本的 jar 包
+* 通过如下命令将 Jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可。
+
+```Bash
cp spark-iotdb-connector_2.12.10-${iotdb.version}.jar $SPARK_HOME/jars/
```
-此外,为了保证 spark 能使用 JDBC 和 IoTDB 连接,需要进行如下操作:
+为了保证 spark 能使用 JDBC 和 IoTDB 连接,需要进行如下操作:
-运行如下命令来编译 IoTDB-JDBC 连接器:
+* 运行如下命令来编译 IoTDB-JDBC 连接器
-```shell
+```Bash
mvn clean package -pl iotdb-client/jdbc -am -DskipTests -P get-jar-with-dependencies
```
-编译后的 jar 包在如下目录中:
+* 编译后的 jar 包在如下目录中
-```shell
+```Bash
$IoTDB_HOME/iotdb-client/jdbc/target/iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar
```
-最后再将 jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可。执行以下命令即可:
+* 运行如下命令将 jar 包拷贝到 `${SPARK_HOME}/jars` 目录中即可
-```shell
+```Bash
cp iotdb-jdbc-{version}-SNAPSHOT-jar-with-dependencies.jar $SPARK_HOME/jars/
```
-## 4. 使用
-
-### 4.1 参数
+## 4. 使用方式
+### 4.1 参数介绍
-| 参数 | 描述 | 默认值 | 使用范围 | 能否为空 |
-|--------------|------------------------------------------------|------|------------|-------|
-| url | 指定 IoTDB 的 JDBC 的 URL | null | read、write | false |
-| user | IoTDB 的用户名 | root | read、write | true |
-| password | IoTDB 的密码 | root | read、write | true |
-| sql | 用于指定查询的 SQL 语句 | null | read | true |
-| numPartition | 在 read 中用于指定 DataFrame 的分区数,在 write 中用于设置写入并发数 | 1 | read、write | true |
-| lowerBound | 查询的起始时间戳(包含) | 0 | read | true |
-| upperBound | 查询的结束时间戳(包含) | 0 | read | true |
+| **参数** | **描述** | **默认值** | **使用范围** | **能否为空** |
+| ---------------- | ---------------------------------------------------------------------- | ------------------ | -------------------- | -------------------- |
+| url | 指定 IoTDB 的 JDBC 的 URL | null | read、write | FALSE |
+| user | IoTDB 的用户名 | root | read、write | TRUE |
+| password | IoTDB 的密码 | root | read、write | TRUE |
+| sql | 用于指定查询的 SQL 语句 | null | read | TRUE |
+| numPartition | 在 read 中用于指定 DataFrame 的分区数,在 write 中用于设置写入并发数 | 1 | read、write | TRUE |
+| lowerBound | 查询的起始时间戳(包含) | 0 | read | TRUE |
+| upperBound | 查询的结束时间戳(包含) | 0 | read | TRUE |
-### 4.2 从 IoTDB 读取数据
+### 4.2 读取数据
-以下是一个示例,演示如何从 IoTDB 中读取数据成为 DataFrame。
+* 从 IoTDB 中读取数据成为 DataFrame
```scala
import org.apache.iotdb.spark.db._
@@ -116,9 +117,7 @@ df.printSchema()
df.show()
```
-### 4.3 将数据写入 IoTDB
-
-以下是一个示例,演示如何将数据写入 IoTDB。
+### 4.3 写入数据
```scala
// 构造窄表数据
@@ -162,52 +161,33 @@ dfWithColumn.write.format("org.apache.iotdb.spark.db")
.save
```
-### 4.4 宽表与窄表转换
-
-以下是如何转换宽表与窄表的示例:
-
-* 从宽到窄
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
-val narrow_df = Transformer.toNarrowForm(spark, wide_df)
-```
-
-* 从窄到宽
-
-```scala
-import org.apache.iotdb.spark.db._
-
-val wide_df = Transformer.toWideForm(spark, narrow_df)
-```
-
## 5. 宽表与窄表
+### 5.1 数据格式示例
-以下 TsFile 结构为例:TsFile 模式中有三个度量:状态,温度和硬件。 这三种测量的基本信息如下:
+以 TsFile 结构为例,假设 TsFile 模式中有三个度量:状态,温度和硬件。
-| 名称 | 类型 | 编码 |
-|-----|---------|-------|
-| 状态 | Boolean | PLAIN |
-| 温度 | Float | RLE |
-| 硬件 | Text | PLAIN |
+* 基本信息如下:
-TsFile 中的现有数据如下:
+| 名称 | 类型 | 编码 |
+| ------ | --------- | ------- |
+| 状态 | Boolean | PLAIN |
+| 温度 | Float | RLE |
+| 硬件 | Text | PLAIN |
-* `d1:root.ln.wf01.wt01`
-* `d2:root.ln.wf02.wt02`
+* 数据如下:
+ * `d1:root.ln.wf01.wt01`
+ * `d2:root.ln.wf02.wt02`
| time | d1.status | time | d1.temperature | time | d2.hardware | time | d2.status |
-|------|-----------|------|----------------|------|-------------|------|-----------|
+| ------ | ----------- | ------ | ---------------- | ------ | ------------- | ------ | ----------- |
| 1 | True | 1 | 2.2 | 2 | "aaa" | 1 | True |
| 3 | True | 2 | 2.2 | 4 | "bbb" | 2 | False |
| 5 | False | 3 | 2.1 | 6 | "ccc" | 4 | True |
-宽(默认)表形式如下:
+* 宽表(默认)形式如下:
| Time | root.ln.wf02.wt02.temperature | root.ln.wf02.wt02.status | root.ln.wf02.wt02.hardware | root.ln.wf01.wt01.temperature | root.ln.wf01.wt01.status | root.ln.wf01.wt01.hardware |
-|------|-------------------------------|--------------------------|----------------------------|-------------------------------|--------------------------|----------------------------|
+| ------ | ------------------------------- | -------------------------- | ---------------------------- | ------------------------------- | -------------------------- | ---------------------------- |
| 1 | null | true | null | 2.2 | true | null |
| 2 | null | false | aaa | 2.2 | null | null |
| 3 | null | null | null | 2.1 | true | null |
@@ -215,10 +195,10 @@ TsFile 中的现有数据如下:
| 5 | null | null | null | null | false | null |
| 6 | null | null | ccc | null | null | null |
-你还可以使用窄表形式,如下所示:
+* 窄表形式如下:
| Time | Device | status | hardware | temperature |
-|------|-------------------|--------|----------|-------------|
+| ------ | ------------------- | -------- | ---------- | ------------- |
| 1 | root.ln.wf02.wt01 | true | null | 2.2 |
| 1 | root.ln.wf02.wt02 | true | null | null |
| 2 | root.ln.wf02.wt01 | null | null | 2.2 |
@@ -227,3 +207,22 @@ TsFile 中的现有数据如下:
| 4 | root.ln.wf02.wt02 | true | bbb | null |
| 5 | root.ln.wf02.wt01 | false | null | null |
| 6 | root.ln.wf02.wt02 | null | ccc | null |
+
+### 5.2 数据转换示例
+
+* 从宽表到窄表
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = spark.read.format("org.apache.iotdb.spark.db").option("url", "jdbc:iotdb://127.0.0.1:6667/").option("sql", "select * from root.** where time < 1100 and time > 1000").load
+val narrow_df = Transformer.toNarrowForm(spark, wide_df)
+```
+
+* 从窄表到宽表
+
+```scala
+import org.apache.iotdb.spark.db._
+
+val wide_df = Transformer.toWideForm(spark, narrow_df)
+```