mihaibudiu commented on code in PR #4814:
URL: https://github.com/apache/calcite/pull/4814#discussion_r2875065788


##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,54 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
     // Now the left join
     builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
 
+    // for multi-column IN with nullable RHS columns, the global ck < c check 
is
+    // insufficient. Consider:
+    //
+    //   (empno, deptno) IN (select empno, deptno from ...)
+    //
+    // where one RHS row is (7521, null):
+    //   empno=7521: AND(TRUE,  UNKNOWN) = UNKNOWN  → result should be null

Review Comment:
   I also don't understand what this AND is



##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,54 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
     // Now the left join
     builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
 
+    // for multi-column IN with nullable RHS columns, the global ck < c check 
is
+    // insufficient. Consider:
+    //
+    //   (empno, deptno) IN (select empno, deptno from ...)
+    //
+    // where one RHS row is (7521, null):
+    //   empno=7521: AND(TRUE,  UNKNOWN) = UNKNOWN  → result should be null
+    //   empno=7369: AND(FALSE, UNKNOWN) = FALSE    → result should be false
+    //
+    // ck < c only tells us "some RHS row has a null", but cannot tell which
+    // LHS keys are affected. Instead, we LEFT JOIN a null-rows table (nr) of
+    // RHS rows that have at least one NULL column, using per-column condition
+    // (col IS NULL OR key = col), where a NULL col is a wildcard.
+    // nr.i IS NOT NULL means a null-row matched the LHS key, result is 
UNKNOWN.
+    // nr.i IS NULL means no null-row matched, fall through to false.
+    String nrAlias = "nr";
+    if (subQueryIndex != 0) {
+      nrAlias = "nr" + subQueryIndex;
+    }
+    // Only needed for multi-column, non-literal IN with null-safety logic.
+    // Single-column cases are handled correctly by the existing ck < c check.

Review Comment:
   What happens if the single column has a ROW type?



##########
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java:
##########
@@ -787,6 +788,54 @@ private static RexNode rewriteIn(RexSubQuery e, 
Set<CorrelationId> variablesSet,
     // Now the left join
     builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);
 
+    // for multi-column IN with nullable RHS columns, the global ck < c check 
is

Review Comment:
   The comment here needs some context: the actual comparison `ck < c` is much 
lower.
   Perhaps you can first explain how the expansion is handled for scalar values.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to