[ 
https://issues.apache.org/jira/browse/CALCITE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089596#comment-18089596
 ] 

zzwqqq edited comment on CALCITE-7607 at 6/17/26 9:45 AM:
----------------------------------------------------------

I am trying a smaller validator-side change first, by adding 
RexUtil.isLosslessCast in TypeCoercionImpl#querySourceCoercion, so validation 
does not add a narrowing CAST for a DML source assignment,
{code:java}
  @Override public boolean querySourceCoercion(@Nullable SqlValidatorScope 
scope,
      RelDataType sourceRowType, RelDataType targetRowType, SqlNode query) {
    final List<RelDataTypeField> sourceFields = sourceRowType.getFieldList();
    final List<RelDataTypeField> targetFields = targetRowType.getFieldList();
    final int sourceCount = sourceFields.size();
    for (int i = 0; i < sourceCount; i++) {
      RelDataType sourceType = sourceFields.get(i).getType();
      RelDataType targetType = targetFields.get(i).getType();
      if (!SqlTypeUtil.equalSansNullability(validator.getTypeFactory(), 
sourceType, targetType)
          && !SqlTypeUtil.canAssignFrom(targetType, sourceType)) {
        // DML source coercion follows assignment compatibility. A source that
        // is only cast-compatible needs an explicit CAST in SQL.
        return false;
      }
    }
    boolean coerced = false;
    for (int i = 0; i < sourceFields.size(); i++) {
      RelDataType sourceType = sourceFields.get(i).getType();
      RelDataType targetType = targetFields.get(i).getType();
      if (!SqlTypeUtil.equalSansNullability(validator.getTypeFactory(), 
sourceType, targetType)
          && RexUtil.isLosslessCast(sourceType, targetType)) {
        coerced = coerceSourceRowType(scope, query, i, targetType) || coerced;
      }
    }
    return coerced;
  }
{code}
I found at least one remaining planner-side path though. If CoerceInputsRule is 
applied to LogicalTableModify, it uses TableModify#getExpectedInputRowType. For 
INSERT, that is the target table row type, so RelOptUtil#createCastRel can add 
the narrowing CAST back.

I am not sure whether this belongs in CoerceInputsRule, by making it 
assignment-aware for TableModify, or in TableModify#getExpectedInputRowType, 
perhaps by passing it the original input row type. There may also be other 
planner paths I have not found yet.

I’d appreciate any advice on which direction fits best.


was (Author: JIRAUSER311718):
I am trying a smaller validator-side change first, by adding 
RexUtil.isLosslessCast in TypeCoercionImpl#querySourceCoercion, so validation 
does not add a narrowing CAST for a DML source assignment,
{code:java}
  @Override public boolean querySourceCoercion(@Nullable SqlValidatorScope 
scope,
      RelDataType sourceRowType, RelDataType targetRowType, SqlNode query) {
    final List<RelDataTypeField> sourceFields = sourceRowType.getFieldList();
    final List<RelDataTypeField> targetFields = targetRowType.getFieldList();
    final int sourceCount = sourceFields.size();
    for (int i = 0; i < sourceCount; i++) {
      RelDataType sourceType = sourceFields.get(i).getType();
      RelDataType targetType = targetFields.get(i).getType();
      if (!SqlTypeUtil.equalSansNullability(validator.getTypeFactory(), 
sourceType, targetType)
          && !SqlTypeUtil.canAssignFrom(targetType, sourceType)) {
        // DML source coercion follows assignment compatibility. A source that
        // is only cast-compatible needs an explicit CAST in SQL.
        return false;
      }
    }
    boolean coerced = false;
    for (int i = 0; i < sourceFields.size(); i++) {
      RelDataType sourceType = sourceFields.get(i).getType();
      RelDataType targetType = targetFields.get(i).getType();
      if (!SqlTypeUtil.equalSansNullability(validator.getTypeFactory(), 
sourceType, targetType)
          && RexUtil.isLosslessCast(sourceType, targetType)) {
        coerced = coerceSourceRowType(scope, query, i, targetType) || coerced;
      }
    }
    return coerced;
  }
{code}
I found at least one remaining planner-side path though. If CoerceInputsRule is 
applied to LogicalTableModify, it uses TableModify#getExpectedInputRowType. For 
INSERT, that is the target table row type, so RelOptUtil#createCastRel can add 
the narrowing CAST back.

I am not sure whether this belongs in CoerceInputsRule, by making it 
assignment-aware for TableModify, or somewhere around 
TableModify#getExpectedInputRowType. There may also be other planner paths I 
have not found yet.

I’d appreciate any advice on which direction fits best.

> DML type coercion inserts implicit narrowing casts for target-column 
> assignments
> --------------------------------------------------------------------------------
>
>                 Key: CALCITE-7607
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7607
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: zzwqqq
>            Assignee: zzwqqq
>            Priority: Major
>
> For {{{}INSERT{}}}, {{{}UPDATE{}}}, and {{{}MERGE{}}}, Calcite may coerce a 
> source expression to the target column type if the two types are 
> cast-compatible. This happens even when the SQL does not contain an explicit 
> {{{}CAST{}}}.
> That can change DML assignment semantics. A target table may need to reject 
> the value, or may need to apply its own checked assignment rules. Calcite 
> should not turn a bare source expression into an ordinary cast unless the 
> assignment is safe to rewrite.
> This includes length, precision, and range narrowing. For example, inserting 
> {{VARCHAR(64)}} into {{{}VARCHAR(16){}}}, {{DECIMAL(18, 4)}} into 
> {{{}DECIMAL(10, 2){}}}, or {{BIGINT}} into {{INTEGER}} should not be hidden 
> by an implicit cast in the DML plan. Several databases reject those 
> assignments when the actual value does not fit the target column.
> Tests in postgresql, mysql and oracle:
> [https://onecompiler.com/postgresql/44sf3dz68]
> [https://onecompiler.com/mysql/44sf3efz4]
> [https://onecompiler.com/oracle/44sf3ewmh]
> h3. Steps to Reproduce
> Add this case in {{{}TypeCoercionConverterTest{}}}:
> {code:java}
> @Test void testInsertVarcharNarrowingKeepsSourceType() {
> final String sql = "insert into t1 select "
> + "CAST('AVeryLongLongStringValue' AS VARCHAR(64)), "
> + "t2_smallint, t2_int, t2_bigint,\n"
> + "t2_real, t2_double, t2_decimal, t2_timestamp, "
> + "t2_date, t2_binary, t2_boolean from t2";
> sql(sql).ok();
> }{code}
> The plan is
> {code:java}
> LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], 
> flattened=[false])
>   LogicalProject(t1_varchar20=['AVeryLongLongStringV':VARCHAR(20)], 
> t1_smallint=[$1], t1_int=[$2], t1_bigint=[$3], t1_real=[$4], t1_double=[$5], 
> t1_decimal=[$6], t1_timestamp=[$7], t1_date=[$8], t1_binary=[$9], 
> t1_boolean=[$10])
>     LogicalTableScan(table=[[CATALOG, SALES, T2]])
>  {code}
> The source expression is {{{}CAST('AVeryLongLongStringValue' AS 
> VARCHAR(64)){}}}, but the plan has already truncated it to 
> {{{}'AVeryLongLongStringV':VARCHAR(20){}}}. This bypasses target-column 
> assignment checks.
> Desired behavior: the plan keeps the source type:
> {code:java}
> LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], 
> flattened=[false])
>   LogicalProject(t1_varchar20=['AVeryLongLongStringValue':VARCHAR(64)], 
> t1_smallint=[$1], t1_int=[$2], t1_bigint=[$3], t1_real=[$4], t1_double=[$5], 
> t1_decimal=[$6], t1_timestamp=[$7], t1_date=[$8], t1_binary=[$9], 
> t1_boolean=[$10])
>     LogicalTableScan(table=[[CATALOG, SALES, T2]])
>  {code}
> h3. Possible Fix
> One possible approach is to keep store assignment separate from ordinary cast 
> coercion:
>  * In validation, use assignment compatibility for bare DML sources, and 
> leave ordinary cast compatibility to user-written {{CAST}} expressions.
>  * In query-source coercion, rewrite only assignments that are known to be 
> safe to rewrite.
>  * Preserve source/target assignment metadata through {{TableModify}} so 
> later planning and execution stages can still see the original source type.
>  * For Calcite's enumerable execution path, reject kept-source values that do 
> not fit the target column.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to