agrawaldevesh commented on a change in pull request #29342:
URL: https://github.com/apache/spark/pull/29342#discussion_r466738015

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala
##########
@@ -71,8 +89,134 @@ case class ShuffledHashJoinExec(
     val numOutputRows = longMetric("numOutputRows")
     streamedPlan.execute().zipPartitions(buildPlan.execute()) { (streamIter, buildIter) =>
       val hashed = buildHashedRelation(buildIter)
-      join(streamIter, hashed, numOutputRows)
+      joinType match {
+        case FullOuter => fullOuterJoin(streamIter, hashed, numOutputRows)

Review comment:
I am not sure that's a good idea: spark.sql.autoBroadcastJoinThreshold is a very widely used config, and I think we should have a separate config that disables just this full outer join optimization, without having to turn off BHJ itself.
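
To make the concern concrete, here is a minimal toy sketch of the two options. It assumes only the documented behavior that setting spark.sql.autoBroadcastJoinThreshold to -1 disables broadcast joins. Everything else is hypothetical and for illustration only: `JoinConf`, `JoinConfSketch`, and the `fullOuterShuffledHashJoinEnabled` flag are made-up names, and the predicates do not mirror Spark's actual planner logic.

```scala
// Toy model only: these names are hypothetical and do not exist in Spark.
final case class JoinConf(
    // Stands in for spark.sql.autoBroadcastJoinThreshold; -1 disables broadcast joins.
    autoBroadcastJoinThreshold: Long,
    // Hypothetical dedicated switch for the full outer shuffled hash join path.
    fullOuterShuffledHashJoinEnabled: Boolean)

object JoinConfSketch {
  // Simplified rule: broadcast hash join (BHJ) is considered only for a non-negative threshold.
  def broadcastHashJoinAllowed(conf: JoinConf): Boolean =
    conf.autoBroadcastJoinThreshold >= 0

  // Option A: reuse the broadcast threshold as the off switch for the new code path.
  def fullOuterShjAllowedViaThreshold(conf: JoinConf): Boolean =
    conf.autoBroadcastJoinThreshold >= 0

  // Option B: gate the new code path on its own flag.
  def fullOuterShjAllowedViaFlag(conf: JoinConf): Boolean =
    conf.fullOuterShuffledHashJoinEnabled
}

// Option A: the only way to switch the new path off also switches BHJ off.
val offViaThreshold = JoinConf(autoBroadcastJoinThreshold = -1L, fullOuterShuffledHashJoinEnabled = true)

// Option B: the new path is off while BHJ stays available.
val offViaFlag = JoinConf(autoBroadcastJoinThreshold = 10L * 1024 * 1024, fullOuterShuffledHashJoinEnabled = false)
```

With option A, a user who hits a problem in the new join path has to give up BHJ for every query in the session. With option B the blast radius is limited to the new code path.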