Thomas Powell created SPARK-48360:
-------------------------------------

             Summary: Simplify conditionals containing predicates
                 Key: SPARK-48360
                 URL: https://issues.apache.org/jira/browse/SPARK-48360
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.4.3
            Reporter: Thomas Powell

The Catalyst optimizer has many optimizations for {{CaseWhen}} and {{If}} expressions that can eliminate branches entirely or replace them with Boolean logic. There are additional "always false" conditionals that could be eliminated entirely. It would also be possible to replace conditionals with Boolean logic where the {{if-branch}} and {{else-branch}} are themselves predicates. The primary motivation is to push more filters down to the data source.

For example:
{code:java}
Filter(If(GreaterThan(a, 2), false, LessThanOrEqual(b, 4)))
{code}
is equivalent to
{code:java}
// a not nullable
Filter(And(LessThanOrEqual(a, 2), LessThanOrEqual(b, 4)))
// a nullable
Filter(And(Not(EqualNullSafe(GreaterThan(a, 2), true)), LessThanOrEqual(b, 4)))
{code}
Within a filter, the nullability handling is admittedly less important, since an expression that evaluates to null is treated the same as false, but the original conditional may have been intentionally written to not return null when {{a}} may be null.
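
The equivalence claimed in the example can be sanity-checked with a small truth-table sketch. This is a standalone Python model of SQL three-valued logic, not Spark code: {{None}} stands in for NULL, the helper names ({{if3}}, {{and3}}, {{not3}}, {{eq_null_safe}}, {{mismatches}}) are hypothetical, and the null-safe equality in the nullable form is taken to be Catalyst's {{EqualNullSafe}} ({{<=>}}). It also assumes that {{If}} falls through to the else-branch when its condition is NULL.

```python
from itertools import product

# SQL three-valued logic modelled with True / False / None (None = NULL).

def gt(x, n):
    return None if x is None else x > n

def le(x, n):
    return None if x is None else x <= n

def not3(p):
    return None if p is None else (not p)

def and3(p, q):
    # False dominates; otherwise any NULL operand makes the result NULL.
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def eq_null_safe(p, q):
    # Null-safe equality never returns NULL: NULL <=> NULL is true.
    return p == q

def if3(cond, then_val, else_val):
    # Assumption: a NULL condition selects the else-branch.
    return then_val if cond is True else else_val

def original(a, b):
    # If(GreaterThan(a, 2), false, LessThanOrEqual(b, 4))
    return if3(gt(a, 2), False, le(b, 4))

def simple_rewrite(a, b):
    # And(LessThanOrEqual(a, 2), LessThanOrEqual(b, 4))
    return and3(le(a, 2), le(b, 4))

def null_safe_rewrite(a, b):
    # And(Not(EqualNullSafe(GreaterThan(a, 2), true)), LessThanOrEqual(b, 4))
    return and3(not3(eq_null_safe(gt(a, 2), True)), le(b, 4))

A_VALUES = [None, 1, 2, 3]
B_VALUES = [None, 3, 4, 5]

def mismatches(rewrite, a_values):
    """Return the (a, b) pairs where the rewrite disagrees with the original."""
    return [
        (a, b)
        for a, b in product(a_values, B_VALUES)
        if original(a, b) is not rewrite(a, b)
    ]

if __name__ == "__main__":
    print("null-safe rewrite, a nullable:", mismatches(null_safe_rewrite, A_VALUES))
    print("simple rewrite, a not null:   ", mismatches(simple_rewrite, [1, 2, 3]))
    print("simple rewrite, a NULL:       ", mismatches(simple_rewrite, [None]))
```

In this model the null-safe form agrees with the original for every combination, and the simple form agrees whenever {{a}} is not NULL. With {{a}} NULL and {{b <= 4}} true, the original evaluates to true while the simple form evaluates to NULL, which is the case the nullable variant is there to preserve.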