Tornike Gurgenidze created SPARK-53618:
------------------------------------------
Summary: Turn a FULL JOIN with false condition to a UNION ALL
Key: SPARK-53618
URL: https://issues.apache.org/jira/browse/SPARK-53618
Project: Spark
Issue Type: Improvement
Components: Optimizer
Affects Versions: 4.0.1
Reporter: Tornike Gurgenidze
Catalyst should translate full joins that have static always false conditions
(like 1 = 0) to a simple union of tables padded with nulls where necessary.
Example join looks like this:
{code:sql}
SELECT *
FROM A
FULL JOIN B ON 1 = 0
{code}
Instead of turning the query to it's simpler format, right now spark executes a
BroadcastNestedLoop join as 1 = 0 is not an equi join condition.
While this is often trivial enough that the users can do it themselves w/o
involving an optimizer, there's a corner case of 1 = 0 condition being used in
a MERGE statement that can't really be optimized by a user by rewriting a query.
This issue is a follow-up on an another
[PR|https://github.com/apache/spark/pull/52185] of mine that paves the way for
a query like this to be used for a replaceWhere-like operation from delta lake
which is not available in iceberg:
{code:sql}
MERGE INTO target_table as t
USING source as s
ON 1 = 0
WHEN NOT MATCHED THEN INSERT *
WHEN NOT MATCHED BY SOURCE AND [replaceWhereCondition] THEN DELETE
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]