mahesh kumar behera created SPARK-44307:
-------------------------------------------

             Summary: Bloom filter is not added for left outer join if the left 
side table is smaller than broadcast threshold.
                 Key: SPARK-44307
                 URL: https://issues.apache.org/jira/browse/SPARK-44307
             Project: Spark
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 3.4.1
            Reporter: mahesh kumar behera
             Fix For: 3.5.0


In a left outer join, a shuffle join is used even if the left-side table is 
small enough to be broadcast. This follows from the semantics of left outer 
join: broadcasting the left (preserved) side would produce wrong results, so 
only the right side can be broadcast. The bloom filter injection does not 
account for this. When injecting the bloom filter, if the left side is smaller 
than the broadcast threshold, the bloom filter is not added, on the assumption 
that the left side will be broadcast and no bloom filter is needed. As a 
result, the bloom filter optimization is missed for a left outer join with a 
small left side and a huge right-side table.
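
The decision described above can be modeled with a minimal sketch. This is
not Spark's optimizer code: the names (Side, is_probably_shuffle_join_naive,
is_probably_shuffle_join_by_type), the join-type strings, and the sizes are
hypothetical and chosen only for illustration. It contrasts a size-only check
with one that also asks whether the join type allows that side to be
broadcast at all.

```python
from dataclasses import dataclass

# Assumption for illustration: join types for which each side may be the
# broadcast (build) side. An outer join cannot broadcast its preserved side.
BROADCASTABLE_LEFT = {"inner", "right_outer"}
BROADCASTABLE_RIGHT = {"inner", "left_outer", "left_semi", "left_anti"}


@dataclass(frozen=True)
class Side:
    """Hypothetical stand-in for one join input with an estimated size."""
    size_in_bytes: int


def fits_threshold(side: Side, threshold: int) -> bool:
    """True if the side is small enough to be considered for broadcast."""
    return 0 <= side.size_in_bytes <= threshold


def is_probably_shuffle_join_naive(left: Side, right: Side, threshold: int) -> bool:
    """Size-only check: ignores the join type (the behavior reported here)."""
    return not fits_threshold(left, threshold) and not fits_threshold(right, threshold)


def is_probably_shuffle_join_by_type(
    join_type: str, left: Side, right: Side, threshold: int
) -> bool:
    """Join-type-aware check: a small side counts only if it may be broadcast."""
    can_left = join_type in BROADCASTABLE_LEFT and fits_threshold(left, threshold)
    can_right = join_type in BROADCASTABLE_RIGHT and fits_threshold(right, threshold)
    return not (can_left or can_right)


# Illustrative sizes: a 1 MiB left side, a 10 GiB right side, a 10 MiB threshold.
THRESHOLD = 10 * 1024 * 1024
small_left = Side(1 * 1024 * 1024)
huge_right = Side(10 * 1024 * 1024 * 1024)

# The size-only check assumes a broadcast join, so no bloom filter is injected.
naive = is_probably_shuffle_join_naive(small_left, huge_right, THRESHOLD)
# The join-type-aware check sees that a left outer join must shuffle here.
aware = is_probably_shuffle_join_by_type("left_outer", small_left, huge_right, THRESHOLD)
```

In this model, a bloom filter would be considered only when the check reports
a probable shuffle join. With the sizes above, the size-only check rules out
a shuffle join because the left side is small, while the join-type-aware
check recognizes that the left side of a left outer join cannot be broadcast.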



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
