Jaromir Vanek created SPARK-53996:
-------------------------------------
Summary: InferFiltersFromConstraints rule does not infer filter
conditions for complex join expressions
Key: SPARK-53996
URL: https://issues.apache.org/jira/browse/SPARK-53996
Project: Spark
Issue Type: Improvement
Components: Optimizer
Affects Versions: 4.0.1
Reporter: Jaromir Vanek
The Spark optimizer's {{InferFiltersFromConstraints}} rule is currently unable
to infer filter conditions when join constraints involve complex expressions.
While it works for simple attribute equalities ({{{}a = b{}}}), it can't infer
constraint from anything more complex than that.
*Example (works as expected):*
{code:sql}
SELECT *
FROM t1
JOIN t2 ON t1.a = t2.b
WHERE t2.b = 1
{code}
In this case, the optimizer correctly infers the additional constraint {{{}t1.a
= 1{}}}.
*Example (does not work):*
{code:sql}
SELECT *
FROM t1
JOIN right ON t1.a = t2.b + 2
WHERE t2.b = 1
{code}
In this case, it is clear that {{t1.a = 3}} (since {{b = 1}} and {{{}a = b +
2{}}}), but the optimizer does not infer this constraint.
*How to Reproduce:*
{code:scala}
spark.sql("CREATE TABLE t1(a INT)")
spark.sql("CREATE TABLE t2(b INT)")
spark.sql("""
SELECT *
FROM t1
INNER JOIN t2 ON t2.b = t1.a + 2
WHERE t1.a = 1
""").explain
{code}
{code:java}
== Physical Plan ==
AdaptiveSparkPlan
+- BroadcastHashJoin [(a#2 + 2)], [b#3], Inner, BuildRight, false
:- Filter (isnotnull(a#2) AND (a#2 = 1))
: +- FileScan spark_catalog.default.t1[a#2]
+- Filter isnotnull(b#3)
+- FileScan spark_catalog.default.t2[b#3]
{code}
*Expected Behavior:*
The optimizer should be able to statically evaluate and infer that {{t2.b = 3}}
given the join condition and the filter on {{{}t1.a{}}}.
*Impact:*
This limits the optimizer's ability to push down filters and optimize query
execution plans for queries with complex join conditions.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]