[ 
https://issues.apache.org/jira/browse/FLINK-35152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tumengyao updated FLINK-35152:
------------------------------
    Description: 
When creating a physical table in Doris, choosing appropriate partition 
fields can speed up queries and computation. Partitioned tables also enable 
other use cases, such as hot and cold data tiering.


Currently, the table that the Flink CDC Doris sink creates when handling a 
create table event has no partitions.


The Auto Partition feature supported by Doris 2.1.x simplifies the creation 
and management of partitions. By adding a few configuration items to the 
Flink CDC job that tell Flink CDC which fields the Doris sink should use for 
partitioning when handling a create table event, you can get a partitioned 
table in Doris.

Here's an example:
source: MySQL
source_table:
CREATE TABLE table1 (
col1 INT AUTO_INCREMENT PRIMARY KEY,
col2 DECIMAL(18, 2),
col3 VARCHAR(500),
col4 TEXT,
col5 DATETIME DEFAULT CURRENT_TIMESTAMP
);


To specify the partitioning of table test.table1, add partition options such 
as sink-table-partition-key, sink-table-partition-type, ..., to the route 
section of mysql_to_doris.yaml:

route:
  - source-table: test.table1
    sink-table: ods.ods_table1
    sink-table-partition-key: col5
    sink-table-partition-func-call-expr: date_trunc(`col5`, 'month')
    sink-table-partition-type: auto range

Auto range partitioning in Doris 2.1.x does not support null partition 
values, so null values of test.table1.col5 must be replaced with a default, 
for example:
case when test.table1.col5 is null then '1990-01-01 00:00:00' else 
test.table1.col5 end
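
The null replacement described above could be sketched as follows. This is 
only an illustration in Python, not Flink CDC code: the function name 
partition_value_or_default and the constant DEFAULT_PARTITION_VALUE are 
hypothetical, and the default is simply the value used in this example.

```python
from datetime import datetime
from typing import Optional

# Hypothetical default, taken from the example in this issue.
DEFAULT_PARTITION_VALUE = datetime(1990, 1, 1, 0, 0, 0)


def partition_value_or_default(
    col5: Optional[datetime],
    default: datetime = DEFAULT_PARTITION_VALUE,
) -> datetime:
    """Return col5, or the default when col5 is null (None).

    Mirrors: case when col5 is null then '1990-01-01 00:00:00' else col5 end
    """
    return default if col5 is None else col5
```

Rows with a null col5 would then all land in the partition that contains the 
default value, so the default should be chosen with that in mind.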

After the mysql_to_doris.yaml Flink CDC job is submitted, a table named 
ods.ods_table1 should appear in the Doris database.
The table DDL is as follows:
CREATE TABLE ods.ods_table1 (
col1 INT,
col5 DATETIME NOT NULL,
col2 DECIMAL(18, 2),
col3 VARCHAR(500),
col4 STRING
) UNIQUE KEY(`col1`, `col5`)
AUTO PARTITION BY RANGE date_trunc(`col5`, 'month')()
DISTRIBUTED BY HASH (`col1`) BUCKETS AUTO
PROPERTIES (
...
);
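
As a rough sketch of how the proposed options could map to the partition 
clause of the DDL above, assuming the option names from this issue (they do 
not exist in Flink CDC today) and the Doris 2.1.x syntax shown in the 
example. The function build_auto_partition_clause is hypothetical and written 
in Python for brevity; it is not the sink's actual implementation.

```python
from typing import Dict


def build_auto_partition_clause(route: Dict[str, str]) -> str:
    """Build the partition clause of a Doris CREATE TABLE statement.

    `route` holds the proposed route options from this issue. Only
    "auto range" is handled, using the syntax from the example DDL.
    """
    partition_type = route.get("sink-table-partition-type", "").strip().lower()
    if not partition_type:
        # No partition options set: keep the current behaviour (no partitions).
        return ""
    if partition_type != "auto range":
        raise ValueError(f"unsupported partition type: {partition_type!r}")

    expr = route.get("sink-table-partition-func-call-expr", "").strip()
    if not expr:
        raise ValueError("auto range partitioning needs a function call expression")

    return f"AUTO PARTITION BY RANGE {expr}()"
```

With the route options from the example, this returns the same 
AUTO PARTITION BY RANGE line that appears in the DDL above; with no partition 
options it returns an empty string, so existing jobs would be unaffected.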

  was:
In some scenarios, when creating a physical table in Doris, appropriate 
partition fields need to be selected to speed up the efficiency of data query 
and calculation. In addition, partition tables support more applications, such 
as hot and cold data layering and so on.
The current Flink CDC Doris Sink's create table event creates a table with no 
partitions set.
The Auto Partition function supported by doris 2.1.x simplifies the creation 
and management of partitions. We just need to add some configuration items to 
the Flink CDC job. To tell Flink CDC which fields Doris Sink will use in the 
create table event to create partitions, you can get a partition table in Doris.

Here's an example:
source: Mysql
source_table:
CREATE TABLE table1 (
col1 INT AUTO_INCREMENT PRIMARY KEY,
col2 DECIMAL(18, 2),
col3 VARCHAR(500),
col4 TEXT,
col5 DATETIME DEFAULT CURRENT_TIMESTAMP
);
If you want to specify the partition of table test.table1, you need to add 
sink-table-partition-keys and sink-table-partition-type information to the 
mysql_to_doris
route:
- source-table: test.table1
sink-table:ods.ods_table1
sink-table-partition-key:col5
sink-table-partition-func-call-expr:date_trunc(`col5`, 'month')
sink-table-partition-type:auto range

The auto range partition in Doris 2.1.x does not support null partitions. So 
you need to set test.table1.col5 == null then '1990-01-01 00:00:00' else 
test.table1.col5 end

Now after submitting the mysql_to_doris.ymal Flink CDC job, an ods.ods_table1 
data table should appear in the Doris database
The data table DDL is as follows:
CREATE TABLE table1 (
col1 INT ,
col5 DATETIME not null,
col2 DECIMAL(18, 2),
col3 VARCHAR(500),
col4 TEXT
) unique KEY(`col1`,`col5`)
AUTO PARTITION BY RANGE date_trunc(`col5`, 'month')()
DISTRIBUTED BY HASH (`id`) BUCKETS AUTO
PROPERTIES (
...
);


> Flink CDC  Doris/Starrocks Sink Auto create table event should support 
> setting auto partition fields for each table
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-35152
>                 URL: https://issues.apache.org/jira/browse/FLINK-35152
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: 3.1.0
>            Reporter: tumengyao
>            Priority: Minor
>              Labels: Doris
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
