This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git

The following commit(s) were added to refs/heads/master by this push:
     new 00147a4471d update routine load
00147a4471d is described below

commit 00147a4471dd7c31913c5010efab50e54711429f
Author: jiafeng.zhang <zhang...@gmail.com>
AuthorDate: Wed Aug 24 09:50:54 2022 +0800

    update routine load
---
 .../Load/CREATE-ROUTINE-LOAD.md | 32 ++++++++++++++++++++++
 .../Load/CREATE-ROUTINE-LOAD.md | 32 ++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md b/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md
index b1873c7d275..c6ec707cf9b 100644
--- a/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md
+++ b/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md
@@ -161,6 +161,38 @@ FROM data_source [data_source_properties]

  `"strict_mode" = "true"`

+ The strict mode mode means strict filtering of column type conversions during the load process. The strict filtering strategy is as follows:
+
+ 1. For column type conversion, if strict mode is true, the wrong data will be filtered. The error data here refers to the fact that the original data is not null, and the result is a null value after participating in the column type conversion.
+ 2. When a loaded column is generated by a function transformation, strict mode has no effect on it.
+ 3. For a column type loaded with a range limit, if the original data can pass the type conversion normally, but cannot pass the range limit, strict mode will not affect it. For example, if the type is decimal(1,0) and the original data is 10, it is eligible for type conversion but not for column declarations. This data strict has no effect on it.
+
+ **strict mode and load relationship of source data**
+
+ Here is an example of a column type of TinyInt.
+
+ > Note: When a column in a table allows a null value to be loaded
+
+ | source data | source data example | string to int | strict_mode | result |
+ | ----------- | ------------------- | ------------- | ------------- | ---------------------- |
+ | null | \N | N/A | true or false | NULL |
+ | not null | aaa or 2000 | NULL | true | invalid data(filtered) |
+ | not null | aaa | NULL | false | NULL |
+ | not null | 1 | 1 | true or false | correct data |
+
+ Here the column type is Decimal(1,0)
+
+ > Note: When a column in a table allows a null value to be loaded
+
+ | source data | source data example | string to int | strict_mode | result |
+ | ----------- | ------------------- | ------------- | ------------- | ---------------------- |
+ | null | \N | N/A | true or false | NULL |
+ | not null | aaa | NULL | true | invalid data(filtered) |
+ | not null | aaa | NULL | false | NULL |
+ | not null | 1 or 10 | 1 | true or false | correct data |
+
+ > Note: 10 Although it is a value that is out of range, because its type meets the requirements of decimal, strict mode has no effect on it. 10 will eventually be filtered in other ETL processing flows. But it will not be filtered by strict mode.
+
  5. `timezone`

  Specifies the time zone used by the import job. The default is to use the Session's timezone parameter. This parameter affects the results of all time zone-related functions involved in the import.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md
index 67246fd35c7..6ced9ee91e4 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md
@@ -162,6 +162,38 @@ FROM data_source [data_source_properties]

  `"strict_mode" = "true"`

+ strict mode means that column type conversions during the load are strictly filtered. The filtering policy is as follows:
+
+ 1. For column type conversion, if strict mode is true, erroneous data is filtered out. Erroneous data here means data whose original value is not null but whose result is null after column type conversion.
+ 2. When a loaded column is generated by a function transformation, strict mode has no effect on it.
+ 3. When a loaded column type has a range limit, strict mode also has no effect on data that passes type conversion but fails the range limit. For example, if the type is decimal(1,0) and the original data is 10, the value passes type conversion but falls outside the range declared for the column. strict mode has no effect on such data.
+
+ **Relationship between strict mode and the loading of source data**
+
+ The following uses a TinyInt column as an example.
+
+ > Note: this applies when the column in the table allows null values to be loaded.
+
+ | source data | source data example | string to int | strict_mode | result |
+ | ----------- | ------------------- | ------------- | ------------- | ---------------------- |
+ | null | \N | N/A | true or false | NULL |
+ | not null | aaa or 2000 | NULL | true | invalid data(filtered) |
+ | not null | aaa | NULL | false | NULL |
+ | not null | 1 | 1 | true or false | correct data |
+
+ The following uses a Decimal(1,0) column as an example.
+
+ > Note: this applies when the column in the table allows null values to be loaded.
+
+ | source data | source data example | string to int | strict_mode | result |
+ | ----------- | ------------------- | ------------- | ------------- | ---------------------- |
+ | null | \N | N/A | true or false | NULL |
+ | not null | aaa | NULL | true | invalid data(filtered) |
+ | not null | aaa | NULL | false | NULL |
+ | not null | 1 or 10 | 1 | true or false | correct data |
+
+ > Note: although 10 is out of range, its type meets the decimal requirement, so strict mode has no effect on it. 10 is eventually filtered in another ETL processing stage, but not by strict mode.
+
  5. `timezone`

  Specifies the time zone used by the load job. Defaults to the Session's timezone parameter. This parameter affects the results of all time zone-related functions involved in the load.
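
The strict mode tables in this commit reduce to one small decision rule, sketched below in Python as a reading aid. This is not Doris code: `apply_strict_mode`, `to_tinyint`, and `to_decimal` are hypothetical names, a Python `None` stands in for both `\N` and SQL NULL, and treating an out-of-range TinyInt such as 2000 as a failed conversion is an assumption taken from the "aaa or 2000" row. The sketch does not model rule 2 (function-generated columns) or the later range check that eventually rejects 10 for decimal(1,0).

```python
from decimal import Decimal, InvalidOperation
from typing import Any, Callable, Optional, Tuple


def to_tinyint(text: str) -> Optional[int]:
    """Hypothetical cast: None (NULL) if the text is not a TinyInt."""
    try:
        value = int(text)
    except ValueError:
        return None
    # Assumption from the "aaa or 2000" row: overflow converts to NULL.
    return value if -128 <= value <= 127 else None


def to_decimal(text: str) -> Optional[Decimal]:
    """Hypothetical cast: checks only the type, not the declared range."""
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


def apply_strict_mode(
    raw: Optional[str],
    cast: Callable[[str], Any],
    strict_mode: bool,
) -> Tuple[str, Any]:
    """Return ("kept", value) or ("filtered", None) for one source value."""
    if raw is None:
        # Source data is already null (\N): strict mode never filters it.
        return ("kept", None)
    converted = cast(raw)
    if converted is None and strict_mode:
        # Non-null source that became NULL after conversion: filtered.
        return ("filtered", None)
    # Either the conversion succeeded, or strict mode is off and NULL is kept.
    return ("kept", converted)


if __name__ == "__main__":
    for raw in (None, "aaa", "2000", "1"):
        for strict in (True, False):
            print(raw, strict, apply_strict_mode(raw, to_tinyint, strict))
    # "10" passes the decimal type conversion, so strict mode keeps it here.
    print(apply_strict_mode("10", to_decimal, True))
```

In this sketch a value is filtered only when the source is non-null, the conversion yields NULL, and `strict_mode` is true, which matches the second row of each table.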