[
https://issues.apache.org/jira/browse/CALCITE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated CALCITE-7607:
------------------------------------
Labels: pull-request-available (was: )
> DML type coercion inserts implicit narrowing casts for target-column
> assignments
> --------------------------------------------------------------------------------
>
> Key: CALCITE-7607
> URL: https://issues.apache.org/jira/browse/CALCITE-7607
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: zzwqqq
> Assignee: zzwqqq
> Priority: Major
> Labels: pull-request-available
>
> For {{{}INSERT{}}}, {{{}UPDATE{}}}, and {{{}MERGE{}}}, Calcite may coerce a
> source expression to the target column type if the two types are
> cast-compatible. This happens even when the SQL does not contain an explicit
> {{{}CAST{}}}.
> That can change DML assignment semantics. A target table may need to reject
> the value, or may need to apply its own checked assignment rules. Calcite
> should not turn a bare source expression into an ordinary cast unless the
> assignment is safe to rewrite.
> This includes length, precision, and range narrowing. For example, inserting
> {{VARCHAR(64)}} into {{{}VARCHAR(16){}}}, {{DECIMAL(18, 4)}} into
> {{{}DECIMAL(10, 2){}}}, or {{BIGINT}} into {{INTEGER}} should not be hidden
> by an implicit cast in the DML plan. Several databases reject those
> assignments when the actual value does not fit the target column.
> Tests in postgresql, mysql and oracle:
> [https://onecompiler.com/postgresql/44sf3dz68]
> [https://onecompiler.com/mysql/44sf3efz4]
> [https://onecompiler.com/oracle/44sf3ewmh]
> h3. Steps to Reproduce
> Add this case in {{{}TypeCoercionConverterTest{}}}:
> {code:java}
> @Test void testInsertVarcharNarrowingKeepsSourceType() {
> final String sql = "insert into t1 select "
> + "CAST('AVeryLongLongStringValue' AS VARCHAR(64)), "
> + "t2_smallint, t2_int, t2_bigint,\n"
> + "t2_real, t2_double, t2_decimal, t2_timestamp, "
> + "t2_date, t2_binary, t2_boolean from t2";
> sql(sql).ok();
> }{code}
> The plan is
> {code:java}
> LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT],
> flattened=[false])
> LogicalProject(t1_varchar20=['AVeryLongLongStringV':VARCHAR(20)],
> t1_smallint=[$1], t1_int=[$2], t1_bigint=[$3], t1_real=[$4], t1_double=[$5],
> t1_decimal=[$6], t1_timestamp=[$7], t1_date=[$8], t1_binary=[$9],
> t1_boolean=[$10])
> LogicalTableScan(table=[[CATALOG, SALES, T2]])
> {code}
> The source expression is {{{}CAST('AVeryLongLongStringValue' AS
> VARCHAR(64)){}}}, but the plan has already truncated it to
> {{{}'AVeryLongLongStringV':VARCHAR(20){}}}. This bypasses target-column
> assignment checks.
> Desired behavior: the plan keeps the source type:
> {code:java}
> LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT],
> flattened=[false])
> LogicalProject(t1_varchar20=['AVeryLongLongStringValue':VARCHAR(64)],
> t1_smallint=[$1], t1_int=[$2], t1_bigint=[$3], t1_real=[$4], t1_double=[$5],
> t1_decimal=[$6], t1_timestamp=[$7], t1_date=[$8], t1_binary=[$9],
> t1_boolean=[$10])
> LogicalTableScan(table=[[CATALOG, SALES, T2]])
> {code}
> h3. Possible Fix
> One possible approach is to keep store assignment separate from ordinary cast
> coercion:
> * In validation, use assignment compatibility for bare DML sources, and
> leave ordinary cast compatibility to user-written {{CAST}} expressions.
> * In query-source coercion, rewrite only assignments that are known to be
> safe to rewrite.
> * Preserve source/target assignment metadata through {{TableModify}} so
> later planning and execution stages can still see the original source type.
> * For Calcite's enumerable execution path, reject kept-source values that do
> not fit the target column.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)