This is an automated email from the ASF dual-hosted git repository.

leonard pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink-cdc.git


The following commit(s) were added to refs/heads/master by this push:
     new e73f3adaa [FLINK-34679][cdc][docs] Add  core concept pages for Flink 
CDC docs
e73f3adaa is described below

commit e73f3adaa365c65882fa0863385e90612d44b278
Author: kunni <[email protected]>
AuthorDate: Mon Mar 18 19:44:13 2024 +0800

    [FLINK-34679][cdc][docs] Add  core concept pages for Flink CDC docs
    
    This closes #3153.
---
 docs/content/docs/core-concept/data-pipeline.md | 77 +++++++++++++++++++++++++
 docs/content/docs/core-concept/data-sink.md     | 25 ++++++++
 docs/content/docs/core-concept/data-source.md   | 26 +++++++++
 docs/content/docs/core-concept/route.md         | 49 ++++++++++++++++
 docs/content/docs/core-concept/table-id.md      | 15 +++++
 docs/content/docs/core-concept/transform.md     |  7 +++
 6 files changed, 199 insertions(+)

diff --git a/docs/content/docs/core-concept/data-pipeline.md b/docs/content/docs/core-concept/data-pipeline.md
index a1cf1986e..3903c922b 100644
--- a/docs/content/docs/core-concept/data-pipeline.md
+++ b/docs/content/docs/core-concept/data-pipeline.md
@@ -23,3 +23,80 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+# Definition
+Since events in Flink CDC flow from upstream to downstream in a pipeline manner, the whole ETL task is referred to as a **Data Pipeline**.
+
+# Parameters
+A pipeline corresponds to a chain of operators in Flink.   
+To describe a Data Pipeline, the following parts are required:
+- [source]({{< ref "docs/core-concept/data-source" >}})
+- [sink]({{< ref "docs/core-concept/data-sink" >}})
+- [pipeline](#pipeline-configurations)
+
+The following parts are optional:
+- [route]({{< ref "docs/core-concept/route" >}})
+- [transform]({{< ref "docs/core-concept/transform" >}})
+
+# Example
+## Only required
+The following YAML file defines a concise Data Pipeline that synchronizes all tables in the MySQL database `app_db` to Doris:
+
+```yaml
+   source:
+     type: mysql
+     hostname: localhost
+     port: 3306
+     username: root
+     password: 123456
+     tables: app_db.\.*
+
+   sink:
+     type: doris
+     fenodes: 127.0.0.1:8030
+     username: root
+     password: ""
+
+   pipeline:
+     name: Sync MySQL Database to Doris
+     parallelism: 2
+```
+
+## With optional
+The following YAML file defines a more complex Data Pipeline that synchronizes all tables in the MySQL database `app_db` to Doris, writing them to the target database `ods_db` with the table name prefix `ods_`:
+
+```yaml
+   source:
+     type: mysql
+     hostname: localhost
+     port: 3306
+     username: root
+     password: 123456
+     tables: app_db.\.*
+
+   sink:
+     type: doris
+     fenodes: 127.0.0.1:8030
+     username: root
+     password: ""
+   route:
+     - source-table: app_db.orders
+       sink-table: ods_db.ods_orders
+     - source-table: app_db.shipments
+       sink-table: ods_db.ods_shipments
+     - source-table: app_db.products
+       sink-table: ods_db.ods_products
+
+   pipeline:
+     name: Sync MySQL Database to Doris
+     parallelism: 2
+```
+
+# Pipeline Configurations
+The following pipeline-level configuration options are supported:
+
+| parameter       | meaning                                                                                 | optional/required |
+|-----------------|-----------------------------------------------------------------------------------------|-------------------|
+| name            | The name of the pipeline, which will be submitted to the Flink cluster as the job name. | optional          |
+| parallelism     | The global parallelism of the pipeline.                                                 | required          |
+| local-time-zone | The local time zone, which defines the current session time zone ID.                    | optional          |
\ No newline at end of file
diff --git a/docs/content/docs/core-concept/data-sink.md b/docs/content/docs/core-concept/data-sink.md
index 9c86f00f6..2dab1dc4a 100644
--- a/docs/content/docs/core-concept/data-sink.md
+++ b/docs/content/docs/core-concept/data-sink.md
@@ -23,3 +23,28 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+# Definition
+**Data Sink** is used to apply schema changes and write change data to external systems.  
+A Data Sink can write to multiple tables simultaneously.
+
+# Parameters
+To describe a Data Sink, the following parameters are used:
+
+| parameter                   | meaning                                                                                          | optional/required |
+|-----------------------------|--------------------------------------------------------------------------------------------------|-------------------|
+| type                        | The type of the sink, such as doris or starrocks.                                                | required          |
+| name                        | The name of the sink, which is user-defined (a default value is provided).                       | optional          |
+| configurations of Data Sink | Configurations to build the Data Sink, e.g. connection configurations and sink table properties. | optional          |
+
+# Example
+The following YAML file defines a Doris sink:
+```yaml
+sink:
+    type: doris
+    name: doris-sink                   # Optional, for description purposes
+    fenodes: 127.0.0.1:8030
+    username: root
+    password: ""
+    table.create.properties.replication_num: 1         # Optional, for advanced functionality
+```
\ No newline at end of file
diff --git a/docs/content/docs/core-concept/data-source.md b/docs/content/docs/core-concept/data-source.md
index d2859bd58..5d6c33deb 100644
--- a/docs/content/docs/core-concept/data-source.md
+++ b/docs/content/docs/core-concept/data-source.md
@@ -23,3 +23,29 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+# Definition
+**Data Source** is used to access metadata and read the changed data from external systems.  
+A Data Source can read data from multiple tables simultaneously.
+
+# Parameters
+To describe a Data Source, the following parameters are used:
+
+| parameter                     | meaning                                                                                              | optional/required |
+|-------------------------------|------------------------------------------------------------------------------------------------------|-------------------|
+| type                          | The type of the source, such as mysql.                                                               | required          |
+| name                          | The name of the source, which is user-defined (a default value is provided).                         | optional          |
+| configurations of Data Source | Configurations to build the Data Source, e.g. connection configurations and source table properties. | optional          |
+
+# Example
+The following YAML file defines a MySQL source:
+```yaml
+source:
+    type: mysql
+    name: mysql-source   # Optional, for description purposes
+    hostname: localhost
+    port: 3306
+    username: admin
+    password: pass
+    tables: adb.*, bdb.user_table_[0-9]+, [app|web]_order_\.*
+```
\ No newline at end of file
diff --git a/docs/content/docs/core-concept/route.md b/docs/content/docs/core-concept/route.md
index 9dbe80c03..0a8c906fb 100644
--- a/docs/content/docs/core-concept/route.md
+++ b/docs/content/docs/core-concept/route.md
@@ -23,3 +23,52 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+# Definition
+**Route** specifies the rules for matching a list of source tables and mapping them to sink tables. The most typical scenario is merging sharded databases and tables, that is, routing multiple upstream source tables to the same sink table.
+
+# Parameters
+To describe a route, the following parameters are used:
+
+| parameter    | meaning                                                | optional/required |
+|--------------|--------------------------------------------------------|-------------------|
+| source-table | Source table id, supports regular expressions          | required          |
+| sink-table   | Sink table id, supports regular expressions            | required          |
+| description  | Routing rule description (a default value is provided) | optional          |
+
+A route module can contain a list of source-table/sink-table rules.
+
+# Example
+## Route one Data Source table to one Data Sink table
+To synchronize the table `web_order` in the database `mydb` to a Doris table `ods_web_order`, use the following YAML file to define the route:
+
+```yaml
+route:
+    source-table: mydb.web_order
+    sink-table: mydb.ods_web_order
+    description: sync table to one destination table with given prefix ods_
+```
+
+## Route multiple Data Source tables to one Data Sink table
+To synchronize the sharded tables in the database `mydb` to a single Doris table `ods_web_order`, use the following YAML file to define the route:
+```yaml
+route:
+    source-table: mydb\.*
+    sink-table: mydb.ods_web_order
+    description: sync sharding tables to one destination table
+```
+
+## Complex Route via combining route rules
+To specify several different mapping rules, use the following YAML file to define the routes:
+```yaml
+route:
+  - source-table: mydb.orders
+    sink-table: ods_db.ods_orders
+    description: sync orders table to orders
+  - source-table: mydb.shipments
+    sink-table: ods_db.ods_shipments
+    description: sync shipments table to ods_shipments
+  - source-table: mydb.products
+    sink-table: ods_db.ods_products
+    description: sync products table to ods_products
+```
\ No newline at end of file
diff --git a/docs/content/docs/core-concept/table-id.md b/docs/content/docs/core-concept/table-id.md
index 83769301c..261c8fd09 100644
--- a/docs/content/docs/core-concept/table-id.md
+++ b/docs/content/docs/core-concept/table-id.md
@@ -23,3 +23,18 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+# Definition
+When connecting to external systems, it is necessary to establish a mapping relationship with the storage objects of the external system. This is what **Table Id** refers to.
+
+# Example
+To be compatible with most external systems, the Table Id is represented by a 3-tuple: (namespace, schemaName, tableName).  
+Connectors should establish the mapping between the Table Id and storage objects in external systems.
+
+The following table lists the parts of the Table Id for different data systems:
+
+| data system           | parts in tableId         | String example      |
+|-----------------------|--------------------------|---------------------|
+| Oracle/PostgreSQL     | database, schema, table  | mydb.default.orders |
+| MySQL/Doris/StarRocks | database, table          | mydb.orders         |
+| Kafka                 | topic                    | orders              |
diff --git a/docs/content/docs/core-concept/transform.md b/docs/content/docs/core-concept/transform.md
index 76015dea1..0ffa24829 100644
--- a/docs/content/docs/core-concept/transform.md
+++ b/docs/content/docs/core-concept/transform.md
@@ -23,3 +23,10 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+# Definition
+The **Transform** module helps users delete and add data columns based on the existing columns in the table.  
+In addition, it helps users filter out unnecessary data during synchronization.
+
+# Example
+This feature will be supported soon.
\ No newline at end of file
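
The Table Id string forms listed in table-id.md above (`mydb.default.orders`, `mydb.orders`, `orders`) can be read as a (namespace, schemaName, tableName) 3-tuple roughly as sketched here. This is a minimal Python illustration and not code from the commit: `TableId` and `parse_table_id` are hypothetical names rather than Flink CDC APIs, and filling the tuple from the right (so that a two-part id leaves the namespace empty) is an assumption of this sketch.

```python
from typing import NamedTuple, Optional


class TableId(NamedTuple):
    """Hypothetical 3-tuple mirroring (namespace, schemaName, tableName)."""
    namespace: Optional[str]
    schema_name: Optional[str]
    table_name: str


def parse_table_id(text: str) -> TableId:
    """Split a dot-separated id string into its one, two, or three parts.

    Assumption: parts are assigned from the right, so missing leading
    parts (namespace, then schema) are left as None.
    """
    parts = text.split(".")
    if not 1 <= len(parts) <= 3 or any(p == "" for p in parts):
        raise ValueError(f"expected 1 to 3 non-empty parts, got: {text!r}")
    # Left-pad with None so the table name always lands in the last slot.
    padded = [None] * (3 - len(parts)) + parts
    return TableId(*padded)


# The three string examples from the table in table-id.md.
oracle_like = parse_table_id("mydb.default.orders")
mysql_like = parse_table_id("mydb.orders")
kafka_like = parse_table_id("orders")
```

A plain split on `.` only suits literal ids like the ones in that table; patterns such as `app_db.\.*` use the dot both as a separator and inside an escaped regular expression, so they would need a dedicated parser.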
