Etienne Pelissier created CALCITE-7489:
------------------------------------------
Summary: ProjectJoinTransposeRule produces row-type mismatch when
pushing a cross-type nullability-narrowing CAST through an outer Join
Key: CALCITE-7489
URL: https://issues.apache.org/jira/browse/CALCITE-7489
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.41.0
Reporter: Etienne Pelissier
CALCITE-4982 added a default {{preserveExprCondition}} guard in
{{ProjectJoinTransposeRule}} that prevents pushing certain
nullability-narrowing CASTs through a Join. The guard's condition is:
{code:java}
if (relType.getSqlTypeName() == operand0Type.getSqlTypeName()
&& operand0Type.isNullable() && !relType.isNullable()) {
return false;
}
{code}
The {{getSqlTypeName()}} equality check restricts the guard to CASTs where
source and target share a base type (e.g. nullable {{VARCHAR(10)}} →
{{{}VARCHAR(10) NOT NULL{}}}). A CAST that *also* changes the SQL type — say
nullable {{INTEGER}} → {{VARCHAR(10) NOT NULL}} — bypasses the guard entirely,
even though it suffers from exactly the same row-type-mismatch failure when
pushed through an outer Join: the post-pushdown column is widened to nullable
by the join, while the cast's declared NOT NULL output type is unchanged, and
{{RelOptUtil.verifyTypeEquivalence}} aborts the rule with a {{{}Type
mismatch{}}}.
h2. Reproducer
Like CALCITE-4982 / CALCITE-7488, cross-type narrowing CASTs are not naturally
producible from Calcite's own SQL parser, but they arise in practice from rel
trees built by SQL frontends with stronger type inference than Calcite's,
encoding their inference as {{makeCast(matchNullability=false)}} or as
Substrait {{NULLABILITY_REQUIRED}} casts.
A minimal reproducer using {{RelBuilder}} directly:
{code:java}
final Function<RelBuilder, RelNode> relFn = b -> {
final RexBuilder rb = b.getRexBuilder();
b.scan("EMP")
.scan("DEPT")
.join(JoinRelType.LEFT,
b.equals(b.field(2, 0, "DEPTNO"), b.field(2, 1, "DEPTNO")));
// d.deptno post-LEFT-JOIN is nullable INTEGER. Cast it to NOT NULL
// VARCHAR(10) — both type AND nullability change.
final RelDataType varchar10NotNull =
b.getTypeFactory()
.createTypeWithNullability(
b.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 10),
false);
return b.project(
rb.makeCast(varchar10NotNull, b.field("DEPT", "DEPTNO"), false,
false))
.build();
};
relFn(relFn).withRule(CoreRules.PROJECT_JOIN_TRANSPOSE).check();
{code}
planBefore (well-typed input):
{code:java}
LogicalProject(DEPTNO=[CAST($8):VARCHAR(10) NOT NULL])
LogicalJoin(condition=[=($7, $8)], joinType=[left])
LogicalTableScan(table=[[scott, EMP]])
LogicalTableScan(table=[[scott, DEPT]])
{code}
Stack on main:
{code:java}
Type mismatch:
rowtype of original rel: RecordType(VARCHAR(10) NOT NULL DEPTNO) NOT NULL
rowtype of new rel: RecordType(VARCHAR(10) DEPTNO) NOT NULL
Difference:
DEPTNO: VARCHAR(10) NOT NULL -> VARCHAR(10)
at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:436)
at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:58)
at
org.apache.calcite.rel.rules.ProjectJoinTransposeRule.onMatch(ProjectJoinTransposeRule.java:155)
{code}
This is the same {{Type mismatch}} CALCITE-4982 fixed for the same-base-type
case.
h2. Proposed fix
Drop the {{getSqlTypeName()}} equality check from the guard. The corrected
condition refuses any nullability-narrowing CAST, regardless of whether it also
changes the SQL type:
{code:java}
if (operand0Type.isNullable() && !relType.isNullable()) {
return false;
}
{code}
Non-narrowing cross-type casts (e.g. {{CAST(VARCHAR AS INTEGER)}} where the
target is nullable, or where the input is already not-null) are unchanged —
they still pass the guard.
h2. Considered and rejected alternative
Limiting the guard to same-base-type narrowing CASTs (the current behavior) was
likely a pragmatic scoping choice in CALCITE-4982, since the SQL idioms that
originally surfaced the bug (COALESCE / IS-NOT-NULL guarded casts) all produce
same-type narrowing. With the bug now empirically demonstrated for cross-type
casts as well, that scoping no longer matches the underlying soundness concern.
h2. Relationship to sibling JIRA
This is one of two independent gaps in CALCITE-4982's guard:
* This JIRA: the {{getSqlTypeName()}} equality is too narrow — broaden the
trigger.
* Sibling Jira (CALCITE-7488): the check runs only on the top-level expression
— recurse into compound expressions like CASE/COALESCE.
Both can probably be addressed in a single PR (the trigger-condition fix and
the recursion are orthogonal, ~5 lines each in
{{{}Config.DEFAULT.preserveExprCondition{}}}) or in two PRs (one per concern).
Happy to follow whichever approach reviewers prefer; please indicate on either
JIRA.
h2. Implementation status
Local branch ready with the recursive-compound-expression fix from the sibling
JIRA. Adding this fix is a one-line edit (drop the {{getSqlTypeName()}} clause)
plus a {{{}relFn{}}}-based regression test covering cross-type narrowing. Will
fold into the existing branch or open a separate PR depending on reviewer
preference.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)