Thomas Powell created SPARK-48360:
-------------------------------------

             Summary: Simplify conditionals containing predicates
                 Key: SPARK-48360
                 URL: https://issues.apache.org/jira/browse/SPARK-48360
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.4.3
            Reporter: Thomas Powell


The Catalyst optimizer has many optimizations for {{CaseWhen}} and {{If}} 
expressions that can eliminate branches entirely or replace them with Boolean 
logic. There are additional "always false" conditionals that could be 
eliminated entirely. It would also be possible to replace conditionals with 
Boolean logic where the {{if-branch}} and {{else-branch}} are themselves 
predicates. The primary motivation is to push more filters down to the 
data source.

For example:
{code:java}
Filter(If(GreaterThan(a, 2), false, LessThanOrEqual(b, 4))){code}
is equivalent to
{code:java}
// a is not nullable
Filter(And(LessThanOrEqual(a, 2), LessThanOrEqual(b, 4)))

// a is nullable
Filter(And(Not(EqualNullSafe(GreaterThan(a, 2), true)), LessThanOrEqual(b, 4)))
{code}
Within a filter, the nullability handling is admittedly less important, since 
an expression that evaluates to null is treated the same as false, but the 
original conditional may have been intentionally written so that it does not 
return null when {{a}} is null.
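
To make the nullability point concrete, here is a minimal, self-contained sketch that models SQL three-valued logic with Java {{Boolean}} values ({{null}} stands for SQL NULL) and compares the original conditional with both rewrites over a few sample inputs. It is not Catalyst code: the class name {{ConditionalRewriteCheck}} and all helper methods are hypothetical, and it assumes that {{If}} takes the else-branch when its predicate evaluates to null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class ConditionalRewriteCheck {
    // Sample column values; null models SQL NULL.
    static final List<Integer> SAMPLES = Arrays.asList(null, 1, 2, 3, 4, 5);

    // Three-valued AND: false dominates, otherwise null propagates.
    static Boolean and(Boolean x, Boolean y) {
        if (Boolean.FALSE.equals(x) || Boolean.FALSE.equals(y)) return false;
        if (x == null || y == null) return null;
        return true;
    }

    static Boolean not(Boolean x) { return x == null ? null : !x; }

    // Null-safe equality never returns null.
    static Boolean eqNullSafe(Boolean x, Boolean y) { return Objects.equals(x, y); }

    static Boolean gt(Integer x, int c) { return x == null ? null : x > c; }

    static Boolean le(Integer x, int c) { return x == null ? null : x <= c; }

    // Assumption: a null predicate selects the else-branch.
    static Boolean iff(Boolean p, Boolean t, Boolean e) {
        return Boolean.TRUE.equals(p) ? t : e;
    }

    // If(a > 2, false, b <= 4)
    static Boolean original(Integer a, Integer b) {
        return iff(gt(a, 2), false, le(b, 4));
    }

    // And(a <= 2, b <= 4)
    static Boolean rewriteNonNullable(Integer a, Integer b) {
        return and(le(a, 2), le(b, 4));
    }

    // And(Not(a > 2 <=> true), b <= 4)
    static Boolean rewriteNullable(Integer a, Integer b) {
        return and(not(eqNullSafe(gt(a, 2), true)), le(b, 4));
    }

    // True if the null-safe rewrite matches the original on every sample pair.
    static boolean nullSafeRewriteIsExact() {
        for (Integer a : SAMPLES)
            for (Integer b : SAMPLES)
                if (!Objects.equals(original(a, b), rewriteNullable(a, b))) return false;
        return true;
    }

    // True if the simpler rewrite matches the original whenever a is not null.
    static boolean simpleRewriteIsExactForNonNullA() {
        for (Integer a : SAMPLES) {
            if (a == null) continue;
            for (Integer b : SAMPLES)
                if (!Objects.equals(original(a, b), rewriteNonNullable(a, b))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("null-safe rewrite exact: " + nullSafeRewriteIsExact());
        System.out.println("simple rewrite exact for non-null a: " + simpleRewriteIsExactForNonNullA());
        System.out.println("a = NULL, b = 4, original: " + original(null, 4));
        System.out.println("a = NULL, b = 4, simple rewrite: " + rewriteNonNullable(null, 4));
    }
}
```

In this model, the one kind of input where the simpler form diverges from the original is a null {{a}} together with a {{b}} that satisfies {{b <= 4}}: the original selects the else-branch, while the plain {{And}} yields null. This is the case the null-safe form is meant to preserve.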



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
