gengliangwang opened a new pull request, #56075: URL: https://github.com/apache/spark/pull/56075
### What changes were proposed in this pull request?

This is a sub-task of [SPARK-56908](https://issues.apache.org/jira/browse/SPARK-56908).

Two statically-dead patterns in `SortMergeJoinExec` codegen:

1. `genComparison` emits

   ```
   comp = 0;
   if (comp == 0) { comp = compare(k1); }
   if (comp == 0) { comp = compare(k2); }
   ```

   The first `if (comp == 0)` is always true (we just assigned 0). Emit `comp = compare(k1);` directly; only wrap subsequent keys. `genComparison` is called 5x per SMJ stage (twice in `genScanner`, three times in `codegenFullOuter`). For single-key joins (common), each call collapses to one line.

2. `genScanner` and `codegenFullOuter` emit `if (k1IsNull || k2IsNull || ...) { handler }`. When all key `ExprValue`s have `isNull == FalseLiteral`, the disjunction is statically `false` and the whole block (including its `handleStreamedAnyNull` / "join with null row" handler) is dead. Detect this and omit the block. Hits fact/dimension joins on numeric keys where Spark has already proved non-nullability.

### Why are the changes needed?

Smaller generated Java per SMJ stage. JIT eliminates the dead code at runtime; the win is smaller generated source, more 64KB method-limit headroom, and slightly faster Janino compile.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing test suites cover both paths with whole-stage codegen on and off:

- `OuterJoinSuite` (SMJ full-outer codegen + interpreted scanner).
- `InnerJoinSuite` (SMJ codegen and non-codegen paths).
- `ExistenceJoinSuite` (SMJ existence path).

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
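As a rough illustration of the two codegen changes described in the pull request, the sketch below builds the generated source as plain strings. It is not Spark's `SortMergeJoinExec` code: the class `SmjCodegenSketch`, the method `genAnyNullCheck`, the string-based signatures, and the use of the literal `"false"` to stand in for `FalseLiteral` are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the string-building part of a code generator.
// Names and shapes are illustrative only; this is not Spark's implementation.
final class SmjCodegenSketch {

    // Emits the key comparison. The first key is assigned unconditionally;
    // only subsequent keys are wrapped in `if (comp == 0)`.
    static String genComparison(List<String> compareExprs) {
        if (compareExprs.isEmpty()) {
            return "comp = 0;";
        }
        List<String> lines = new ArrayList<>();
        lines.add("comp = " + compareExprs.get(0) + ";");
        for (String expr : compareExprs.subList(1, compareExprs.size())) {
            lines.add("if (comp == 0) { comp = " + expr + "; }");
        }
        return String.join("\n", lines);
    }

    // Emits the any-null guard, or an empty string when every isNull term is
    // the literal "false" (assumed here to play the role of FalseLiteral).
    static String genAnyNullCheck(List<String> isNullTerms, String handler) {
        boolean allFalse = isNullTerms.stream().allMatch("false"::equals);
        if (allFalse) {
            return "";
        }
        return "if (" + String.join(" || ", isNullTerms) + ") { " + handler + " }";
    }

    public static void main(String[] args) {
        // Single-key join: the comparison collapses to one line.
        System.out.println(genComparison(List.of("compare(k1)")));
        // Two keys: only the second comparison is guarded.
        System.out.println(genComparison(List.of("compare(k1)", "compare(k2)")));
        // All keys proven non-nullable: the guard is omitted entirely.
        System.out.println("[" + genAnyNullCheck(List.of("false", "false"), "handle();") + "]");
    }
}
```

The sketch omits the null guard only when every term is statically false, matching the condition stated in the description; it makes no attempt to prune individual false terms from a mixed disjunction.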
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
