Etienne Pelissier created CALCITE-7488:
------------------------------------------
Summary: ProjectJoinTransposeRule produces row-type mismatch when
pushing a compound expression containing a nullability-narrowing CAST through
an outer Join
Key: CALCITE-7488
URL: https://issues.apache.org/jira/browse/CALCITE-7488
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.41.0
Reporter: Etienne Pelissier
CALCITE-4982 added a default {{preserveExprCondition}} guard in
{{ProjectJoinTransposeRule}} that prevents pushing a top-level {{CAST(nullable
AS NOT NULL)}} (same {{SqlTypeName}}) through a Join. Pushing such a cast to
the right side of a LEFT JOIN moves it into a position where the post-pushdown
column type is widened to nullable by the join, leaving the cast's declared NOT
NULL type inconsistent with the new rowtype — Calcite then aborts the rule with
a {{Type mismatch}} from {{RelOptUtil.verifyTypeEquivalence}}.
The CALCITE-4982 guard checks only the top-level expression's {{SqlKind}}. The
same soundness issue arises when the nullability-narrowing CAST is nested
inside a compound expression such as {{CASE}} / {{COALESCE}}: pushing the
compound expression atomically through the Join moves the embedded CAST into
the same problematic position, but the existing guard does not see it because
the top-level kind is {{CASE}}, not {{CAST}}.
The strongness check in {{PushProjector}} does not catch this case either: a
{{CASE}} whose branches are casts of right-side fields is strong w.r.t. those
fields (a runtime null in any branch propagates as null), so it is treated as
safe to preserve across the LEFT JOIN — the strongness check is about runtime
null-propagation, not declared-type consistency.
h2. Reproducer
The bug is not naturally producible from Calcite's own SQL parser, because
Calcite's parser computes cast nullability consistently with source nullability
and never produces a CAST whose declared output narrows the source's runtime
nullability. It IS produced in practice by SQL frontends with stronger type
inference than Calcite's, encoding their inference into the rel tree via
{{RexBuilder.makeCast(..., matchNullability=false)}} (or via Substrait casts
with {{NULLABILITY_REQUIRED}}). One concrete example: CockroachDB recognizes a
self-join with {{USING(key)}} as guaranteed-matching and asserts post-join
columns are NOT NULL, encoding this in Substrait casts; the Substrait→Calcite
converter then surfaces a well-typed-stricter rel tree to the optimizer, which
{{ProjectJoinTransposeRule}} corrupts.
A minimal reproducer using {{RelBuilder}} directly (this is the regression test
added in the local branch):
{code:java}
final Function<RelBuilder, RelNode> relFn = b -> {
final RexBuilder rb = b.getRexBuilder();
b.scan("EMP")
.scan("DEPT")
.join(JoinRelType.LEFT,
b.equals(b.field(2, 0, "DEPTNO"), b.field(2, 1, "DEPTNO")));
final RelDataType dnameNotNull =
b.getTypeFactory()
.createTypeWithNullability(b.field("DEPT", "DNAME").getType(), false);
final RelDataType locNotNull =
b.getTypeFactory()
.createTypeWithNullability(b.field("DEPT", "LOC").getType(), false);
return b.project(
b.call(SqlStdOperatorTable.CASE,
b.call(SqlStdOperatorTable.IS_NOT_NULL, b.field("DEPT", "DNAME")),
rb.makeCast(dnameNotNull, b.field("DEPT", "DNAME"), false, false),
rb.makeCast(locNotNull, b.field("DEPT", "LOC"), false, false)))
.build();
};
relFn(relFn).withRule(CoreRules.PROJECT_JOIN_TRANSPOSE).check();
{code}
planBefore (well-typed input):
{code}
LogicalProject($f0=[CASE(IS NOT NULL($9), CAST($9):VARCHAR(14) NOT NULL,
CAST($10):VARCHAR(13) NOT NULL)])
LogicalJoin(condition=[=($7, $8)], joinType=[left])
LogicalTableScan(table=[[scott, EMP]])
LogicalTableScan(table=[[scott, DEPT]])
{code}
Stack on main:
{code}
Type mismatch:
rowtype of original rel: RecordType(VARCHAR(14) NOT NULL $f0) NOT NULL
rowtype of new rel: RecordType(VARCHAR(14) $f0) NOT NULL
Difference:
$f0: VARCHAR(14) NOT NULL -> VARCHAR(14)
at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:436)
at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:58)
at
org.apache.calcite.rel.rules.ProjectJoinTransposeRule.onMatch(ProjectJoinTransposeRule.java:155)
{code}
h2. Proposed fix
Extend the existing {{preserveExprCondition}} guard to walk the expression tree
recursively, rejecting any expression whose subtree contains a
nullability-narrowing CAST:
{code:java}
.withPreserveExprCondition(expr -> {
if (expr instanceof RexOver) {
return false;
}
if (containsNullabilityNarrowingCast(expr)) {
return false;
}
return true;
})
static boolean containsNullabilityNarrowingCast(RexNode expr) {
if (expr.getKind() == SqlKind.CAST) {
final RexCall castCall = (RexCall) expr;
final RelDataType outputType = castCall.getType();
final RelDataType inputType = castCall.getOperands().get(0).getType();
if (outputType.getSqlTypeName() == inputType.getSqlTypeName()
&& inputType.isNullable() && !outputType.isNullable()) {
return true;
}
}
if (expr instanceof RexCall) {
for (RexNode operand : ((RexCall) expr).getOperands()) {
if (containsNullabilityNarrowingCast(operand)) {
return true;
}
}
}
return false;
}
{code}
After the fix, the rule falls back to pushing only raw input refs through the
Join, leaving the compound expression on top — same conservative-but-correct
behavior CALCITE-4982 already provides for the top-level CAST case.
h2. Considered and rejected alternative
Refusing the rule match entirely on outer joins would regress unrelated
optimizations. Adjusting the cast's declared type during transformation to
match the post-pushdown rowtype (rather than refusing) is theoretically
possible but requires non-trivial expression rewriting and changes user-visible
behavior; refusing is consistent with the existing CALCITE-4982 approach and
minimal.
h2. Implementation status
Local branch ready with the fix and a {{relFn}}-based regression test sibling
to {{testPushProjectPastOutJoinWithCastNonNullExpr}}. Will open a PR once a
JIRA number is assigned.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)