Etienne Pelissier created CALCITE-7489:
------------------------------------------

             Summary: ProjectJoinTransposeRule produces row-type mismatch when 
pushing a cross-type nullability-narrowing CAST through an outer Join
                 Key: CALCITE-7489
                 URL: https://issues.apache.org/jira/browse/CALCITE-7489
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.41.0
            Reporter: Etienne Pelissier


CALCITE-4982 added a default {{preserveExprCondition}} guard in 
{{ProjectJoinTransposeRule}} that prevents pushing certain 
nullability-narrowing CASTs through a Join. The guard's condition is:
{code:java}
if (relType.getSqlTypeName() == operand0Type.getSqlTypeName()
    && operand0Type.isNullable() && !relType.isNullable()) {
  return false;
}
{code}
The {{getSqlTypeName()}} equality check restricts the guard to CASTs where 
source and target share a base type (e.g. nullable {{VARCHAR(10)}} → 
{{{}VARCHAR(10) NOT NULL{}}}). A CAST that *also* changes the SQL type — say 
nullable {{INTEGER}} → {{VARCHAR(10) NOT NULL}} — bypasses the guard entirely, 
even though it suffers from exactly the same row-type-mismatch failure when 
pushed through an outer Join: the post-pushdown column is widened to nullable 
by the join, while the cast's declared NOT NULL output type is unchanged, and 
{{RelOptUtil.verifyTypeEquivalence}} aborts the rule with a {{{}Type 
mismatch{}}}.
h2. Reproducer

Like CALCITE-4982 / CALCITE-7488, cross-type narrowing CASTs are not naturally 
producible from Calcite's own SQL parser, but they arise in practice from rel 
trees built by SQL frontends with stronger type inference than Calcite's, 
encoding their inference as {{makeCast(matchNullability=false)}} or as 
Substrait {{NULLABILITY_REQUIRED}} casts.

A minimal reproducer using {{RelBuilder}} directly:
{code:java}
final Function<RelBuilder, RelNode> relFn = b -> {
  final RexBuilder rb = b.getRexBuilder();
  b.scan("EMP")
      .scan("DEPT")
      .join(JoinRelType.LEFT,
          b.equals(b.field(2, 0, "DEPTNO"), b.field(2, 1, "DEPTNO")));
  // d.deptno post-LEFT-JOIN is nullable INTEGER. Cast it to NOT NULL
  // VARCHAR(10) — both type AND nullability change.
  final RelDataType varchar10NotNull =
      b.getTypeFactory()
          .createTypeWithNullability(
              b.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 10),
              false);
  return b.project(
          rb.makeCast(varchar10NotNull, b.field("DEPT", "DEPTNO"), false, 
false))
      .build();
};
relFn(relFn).withRule(CoreRules.PROJECT_JOIN_TRANSPOSE).check();
{code}
planBefore (well-typed input):
{code:java}
LogicalProject(DEPTNO=[CAST($8):VARCHAR(10) NOT NULL])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
{code}
Stack on main:
{code:java}
Type mismatch:
rowtype of original rel: RecordType(VARCHAR(10) NOT NULL DEPTNO) NOT NULL
rowtype of new rel:      RecordType(VARCHAR(10) DEPTNO) NOT NULL
Difference:
DEPTNO: VARCHAR(10) NOT NULL -> VARCHAR(10)

at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:436)
at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:58)
at 
org.apache.calcite.rel.rules.ProjectJoinTransposeRule.onMatch(ProjectJoinTransposeRule.java:155)
{code}
This is the same {{Type mismatch}} CALCITE-4982 fixed for the same-base-type 
case.
h2. Proposed fix

Drop the {{getSqlTypeName()}} equality check from the guard. The corrected 
condition refuses any nullability-narrowing CAST, regardless of whether it also 
changes the SQL type:
{code:java}
if (operand0Type.isNullable() && !relType.isNullable()) {
  return false;
}
{code}
Non-narrowing cross-type casts (e.g. {{CAST(VARCHAR AS INTEGER)}} where the 
target is nullable, or where the input is already not-null) are unchanged — 
they still pass the guard.
h2. Considered and rejected alternative

Limiting the guard to same-base-type narrowing CASTs (the current behavior) was 
likely a pragmatic scoping choice in CALCITE-4982, since the SQL idioms that 
originally surfaced the bug (COALESCE / IS-NOT-NULL guarded casts) all produce 
same-type narrowing. With the bug now empirically demonstrated for cross-type 
casts as well, that scoping no longer matches the underlying soundness concern.
h2. Relationship to sibling JIRA

This is one of two independent gaps in CALCITE-4982's guard:
 * This JIRA: the {{getSqlTypeName()}} equality is too narrow — broaden the 
trigger.
 * Sibling Jira (CALCITE-7488): the check runs only on the top-level expression 
— recurse into compound expressions like CASE/COALESCE.

Both can probably be addressed in a single PR (the trigger-condition fix and 
the recursion are orthogonal, ~5 lines each in 
{{{}Config.DEFAULT.preserveExprCondition{}}}) or in two PRs (one per concern). 
Happy to follow whichever approach reviewers prefer; please indicate on either 
JIRA.
h2. Implementation status

Local branch ready with the recursive-compound-expression fix from the sibling 
JIRA. Adding this fix is a one-line edit (drop the {{getSqlTypeName()}} clause) 
plus a {{{}relFn{}}}-based regression test covering cross-type narrowing. Will 
fold into the existing branch or open a separate PR depending on reviewer 
preference.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to