This is an automated email from the ASF dual-hosted git repository.
ic4y pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
new 4d31253993 [feature][doc] update doc for jdbc-connector (#5765)
4d31253993 is described below
commit 4d31253993e4846d12b938f16c8eeabe8b4dd3b4
Author: 老王 <[email protected]>
AuthorDate: Thu Jan 4 13:48:35 2024 +0800
[feature][doc] update doc for jdbc-connector (#5765)
---
docs/en/connector-v2/sink/Jdbc.md | 105 ++++++++++++++++++++++++++++---------
docs/en/connector-v2/sink/Mysql.md | 51 ++++++++++--------
2 files changed, 109 insertions(+), 47 deletions(-)
diff --git a/docs/en/connector-v2/sink/Jdbc.md b/docs/en/connector-v2/sink/Jdbc.md
index 3b1ae9c0eb..a9021b5f5e 100644
--- a/docs/en/connector-v2/sink/Jdbc.md
+++ b/docs/en/connector-v2/sink/Jdbc.md
@@ -26,30 +26,33 @@ support `Xa transactions`. You can set
`is_exactly_once=true` to enable it.
## Options
-| name | type | required | default value |
-|-------------------------------------------|---------|----------|---------------|
-| url | String | Yes | - |
-| driver | String | Yes | - |
-| user | String | No | - |
-| password | String | No | - |
-| query | String | No | - |
-| compatible_mode | String | No | - |
-| database | String | No | - |
-| table | String | No | - |
-| primary_keys | Array | No | - |
-| support_upsert_by_query_primary_key_exist | Boolean | No | false |
-| connection_check_timeout_sec | Int | No | 30 |
-| max_retries | Int | No | 0 |
-| batch_size | Int | No | 1000 |
-| is_exactly_once | Boolean | No | false |
-| generate_sink_sql | Boolean | No | false |
-| xa_data_source_class_name | String | No | - |
-| max_commit_attempts | Int | No | 3 |
-| transaction_timeout_sec | Int | No | -1 |
-| auto_commit | Boolean | No | true |
-| field_ide | String | No | - |
-| properties | Map | No | - |
-| common-options | | no | - |
+| name | type | required | default value |
+|-------------------------------------------|---------|----------|------------------------------|
+| url | String | Yes | - |
+| driver | String | Yes | - |
+| user | String | No | - |
+| password | String | No | - |
+| query | String | No | - |
+| compatible_mode | String | No | - |
+| database | String | No | - |
+| table | String | No | - |
+| primary_keys | Array | No | - |
+| support_upsert_by_query_primary_key_exist | Boolean | No | false |
+| connection_check_timeout_sec | Int | No | 30 |
+| max_retries | Int | No | 0 |
+| batch_size | Int | No | 1000 |
+| is_exactly_once | Boolean | No | false |
+| generate_sink_sql | Boolean | No | false |
+| xa_data_source_class_name | String | No | - |
+| max_commit_attempts | Int | No | 3 |
+| transaction_timeout_sec | Int | No | -1 |
+| auto_commit | Boolean | No | true |
+| field_ide | String | No | - |
+| properties | Map | No | - |
+| common-options | | no | - |
+| schema_save_mode | Enum | no | CREATE_SCHEMA_WHEN_NOT_EXIST |
+| data_save_mode | Enum | no | APPEND_DATA |
+| custom_sql | String | no | - |
### driver [string]
@@ -89,6 +92,20 @@ Use `database` and this `table-name` auto-generate sql and
receive upstream inpu
This option is mutually exclusive with `query` and has a higher priority.
+The `table` parameter can be set to the name of a table that does not exist yet; that name is then used for the table that gets created. It supports the variables `${table_name}` and `${schema_name}`. Replacement rules: `${schema_name}` is replaced with the schema name passed to the target side, and `${table_name}` is replaced with the table name passed to the target side.
+
+MySQL sink examples:
+1. test_${schema_name}_${table_name}_test
+2. sink_sinktable
+3. ss_${table_name}
+
+PostgreSQL (Oracle, SQL Server, ...) sink examples:
+1. ${schema_name}.${table_name}_test
+2. dbo.tt_${table_name}_sink
+3. public.sink_table
+
+Tip: If the target database has the concept of a schema, the `table` parameter must be written as `xxx.xxx` (schema name, then table name).
+
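+As a minimal sketch of the replacement rules above: assuming the upstream table is named `orders` (a hypothetical name) and treating the connection values as placeholders, the `table` value below would, by those rules, resolve to `ss_orders`.
+
+```
+sink {
+    jdbc {
+        # Placeholder connection values, for illustration only
+        url = "jdbc:mysql://localhost:3306"
+        driver = "com.mysql.cj.jdbc.Driver"
+        user = "root"
+        password = "123456"
+
+        database = "sink_database"
+        # ${table_name} is replaced with the upstream table name (assumed here to be "orders")
+        table = "ss_${table_name}"
+        schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
+    }
+}
+```
+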
### primary_keys [array]
This option is used to support operations such as `insert`, `delete`, and
`update` when automatically generate sql.
@@ -152,6 +169,27 @@ Additional connection configuration parameters,when
properties and URL have the
Sink plugin common parameters, please refer to [Sink Common
Options](common-options.md) for details
+### schema_save_mode[Enum]
+
+Before the synchronization task starts, choose how to handle an existing table structure on the target side.
+Options:
+`RECREATE_SCHEMA`: Creates the table when it does not exist; drops and recreates it when it already exists
+`CREATE_SCHEMA_WHEN_NOT_EXIST`: Creates the table when it does not exist; skips creation when it already exists
+`ERROR_WHEN_SCHEMA_NOT_EXIST`: Reports an error when the table does not exist
+
+### data_save_mode[Enum]
+
+Before the synchronization task starts, choose how to handle data that already exists on the target side.
+Options:
+`DROP_DATA`: Preserves the database structure and deletes the data
+`APPEND_DATA`: Preserves the database structure and preserves the data
+`CUSTOM_PROCESSING`: User-defined processing
+`ERROR_WHEN_DATA_EXISTS`: Reports an error when data already exists
+
+### custom_sql[String]
+
+When `data_save_mode` is set to `CUSTOM_PROCESSING`, you should fill in the `custom_sql` parameter. Its value is usually an executable SQL statement, which is run before the synchronization task.
+
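+A minimal sketch of `CUSTOM_PROCESSING`, using placeholder connection values. The SQL statement and the `sync_date` column are purely illustrative assumptions, not something the connector defines; substitute whatever statement your task needs to run before synchronization.
+
+```
+sink {
+    jdbc {
+        # Placeholder connection values, for illustration only
+        url = "jdbc:mysql://localhost:3306"
+        driver = "com.mysql.cj.jdbc.Driver"
+        user = "root"
+        password = "123456"
+
+        database = "sink_database"
+        table = "sink_table"
+        schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
+        data_save_mode = "CUSTOM_PROCESSING"
+        # Hypothetical statement; sync_date is an assumed column name
+        custom_sql = "DELETE FROM sink_table WHERE sync_date < '2024-01-01'"
+    }
+}
+```
+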
## tips
In the case of is_exactly_once = "true", Xa transactions are used. This
requires database support, and some databases require some setup :
@@ -235,6 +273,25 @@ sink {
}
```
+Using the save mode options
+
+```
+sink {
+ jdbc {
+ url = "jdbc:mysql://localhost:3306"
+ driver = "com.mysql.cj.jdbc.Driver"
+ user = "root"
+ password = "123456"
+
+ database = "sink_database"
+ table = "sink_table"
+ primary_keys = ["key1", "key2", ...]
+ schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
+ data_save_mode="APPEND_DATA"
+ }
+}
+```
+
PostgreSQL versions below 9.5 support CDC (change data capture) events
```
diff --git a/docs/en/connector-v2/sink/Mysql.md b/docs/en/connector-v2/sink/Mysql.md
index de5ac92450..e5f750bcba 100644
--- a/docs/en/connector-v2/sink/Mysql.md
+++ b/docs/en/connector-v2/sink/Mysql.md
@@ -58,29 +58,32 @@ semantics (using XA transaction guarantee).
## Sink Options
-| Name | Type | Required | Default | Description |
-|-------------------------------------------|---------|----------|---------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| url | String | Yes | - | The URL of the JDBC connection. Refer to a case: jdbc:mysql://localhost:3306:3306/test |
-| driver | String | Yes | - | The jdbc class name used to connect to the remote data source,<br/> if you use MySQL the value is `com.mysql.cj.jdbc.Driver`. |
-| user | String | No | - | Connection instance user name |
-| password | String | No | - | Connection instance password |
-| query | String | No | - | Use this sql write upstream input datas to database. e.g `INSERT ...`,`query` have the higher priority |
-| database | String | No | - | Use this `database` and `table-name` auto-generate sql and receive upstream input datas write to database.<br/>This option is mutually exclusive with `query` and has a higher priority. |
-| table | String | No | - | Use database and this table-name auto-generate sql and receive upstream input datas write to database.<br/>This option is mutually exclusive with `query` and has a higher priority. |
-| primary_keys | Array | No | - | This option is used to support operations such as `insert`, `delete`, and `update` when automatically generate sql. |
-| support_upsert_by_query_primary_key_exist | Boolean | No | false | Choose to use INSERT sql, UPDATE sql to process update events(INSERT, UPDATE_AFTER) based on query primary key exists. This configuration is only used when database unsupport upsert syntax. **Note**: that this method has low performance |
-| connection_check_timeout_sec | Int | No | 30 | The time in seconds to wait for the database operation used to validate the connection to complete. |
-| max_retries | Int | No | 0 | The number of retries to submit failed (executeBatch) |
-| batch_size | Int | No | 1000 | For batch writing, when the number of buffered records reaches the number of `batch_size` or the time reaches `checkpoint.interval`<br/>, the data will be flushed into the database |
-| is_exactly_once | Boolean | No | false | Whether to enable exactly-once semantics, which will use Xa transactions. If on, you need to<br/>set `xa_data_source_class_name`. |
-| generate_sink_sql | Boolean | No | false | Generate sql statements based on the database table you want to write to |
-| xa_data_source_class_name | String | No | - | The xa data source class name of the database Driver, for example, mysql is `com.mysql.cj.jdbc.MysqlXADataSource`, and<br/>please refer to appendix for other data sources |
-| max_commit_attempts | Int | No | 3 | The number of retries for transaction commit failures |
-| transaction_timeout_sec | Int | No | -1 | The timeout after the transaction is opened, the default is -1 (never timeout). Note that setting the timeout may affect<br/>exactly-once semantics |
-| auto_commit | Boolean | No | true | Automatic transaction commit is enabled by default |
-| field_ide | String | No | - | Identify whether the field needs to be converted when synchronizing from the source to the sink. `ORIGINAL` indicates no conversion is needed;`UPPERCASE` indicates conversion to uppercase;`LOWERCASE` indicates conversion to lowercase. |
-| properties | Map | No | - | Additional connection configuration parameters,when properties and URL have the same parameters, the priority is determined by the <br/>specific implementation of the driver. For example, in MySQL, properties take precedence over the URL. |
-| common-options | | no | - | Sink plugin common parameters, please refer to [Sink Common Options](common-options.md) for details |
+| Name | Type | Required | Default | Description |
+|-------------------------------------------|---------|----------|------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| url | String | Yes | - | The URL of the JDBC connection. For example: jdbc:mysql://localhost:3306/test |
+| driver | String | Yes | - | The jdbc class name used to connect to the remote data source,<br/> if you use MySQL the value is `com.mysql.cj.jdbc.Driver`. |
+| user | String | No | - | Connection instance user name |
+| password | String | No | - | Connection instance password |
+| query | String | No | - | Use this sql write upstream input datas to database. e.g `INSERT ...`,`query` have the higher priority |
+| database | String | No | - | Use this `database` and `table-name` auto-generate sql and receive upstream input datas write to database.<br/>This option is mutually exclusive with `query` and has a higher priority. |
+| table | String | No | - | Use database and this table-name auto-generate sql and receive upstream input datas write to database.<br/>This option is mutually exclusive with `query` and has a higher priority. |
+| primary_keys | Array | No | - | This option is used to support operations such as `insert`, `delete`, and `update` when automatically generate sql. |
+| support_upsert_by_query_primary_key_exist | Boolean | No | false | Choose to use INSERT sql, UPDATE sql to process update events(INSERT, UPDATE_AFTER) based on query primary key exists. This configuration is only used when database unsupport upsert syntax. **Note**: that this method has low performance |
+| connection_check_timeout_sec | Int | No | 30 | The time in seconds to wait for the database operation used to validate the connection to complete. |
+| max_retries | Int | No | 0 | The number of retries to submit failed (executeBatch) |
+| batch_size | Int | No | 1000 | For batch writing, when the number of buffered records reaches the number of `batch_size` or the time reaches `checkpoint.interval`<br/>, the data will be flushed into the database |
+| is_exactly_once | Boolean | No | false | Whether to enable exactly-once semantics, which will use Xa transactions. If on, you need to<br/>set `xa_data_source_class_name`. |
+| generate_sink_sql | Boolean | No | false | Generate sql statements based on the database table you want to write to |
+| xa_data_source_class_name | String | No | - | The xa data source class name of the database Driver, for example, mysql is `com.mysql.cj.jdbc.MysqlXADataSource`, and<br/>please refer to appendix for other data sources |
+| max_commit_attempts | Int | No | 3 | The number of retries for transaction commit failures |
+| transaction_timeout_sec | Int | No | -1 | The timeout after the transaction is opened, the default is -1 (never timeout). Note that setting the timeout may affect<br/>exactly-once semantics |
+| auto_commit | Boolean | No | true | Automatic transaction commit is enabled by default |
+| field_ide | String | No | - | Identify whether the field needs to be converted when synchronizing from the source to the sink. `ORIGINAL` indicates no conversion is needed;`UPPERCASE` indicates conversion to uppercase;`LOWERCASE` indicates conversion to lowercase. |
+| properties | Map | No | - | Additional connection configuration parameters,when properties and URL have the same parameters, the priority is determined by the <br/>specific implementation of the driver. For example, in MySQL, properties take precedence over the URL. |
+| common-options | | no | - | Sink plugin common parameters, please refer to [Sink Common Options](common-options.md) for details |
+| schema_save_mode | Enum | no | CREATE_SCHEMA_WHEN_NOT_EXIST | Before the synchronization task starts, choose how to handle an existing table structure on the target side. |
+| data_save_mode | Enum | no | APPEND_DATA | Before the synchronization task starts, choose how to handle data that already exists on the target side. |
+| custom_sql | String | no | - | When `data_save_mode` is set to `CUSTOM_PROCESSING`, you should fill in the `custom_sql` parameter. Its value is usually an executable SQL statement, which is run before the synchronization task. |
### Tips
@@ -193,6 +196,8 @@ sink {
table = sink_table
primary_keys = ["id","name"]
field_ide = UPPERCASE
+ schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
+ data_save_mode="APPEND_DATA"
}
}
```