sandugood commented on issue #1648: URL: https://github.com/apache/datafusion-ballista/issues/1648#issuecomment-4396595831
In Spark's AQE implementation there is a `DynamicJoinSelection` rule, defined in [DynamicJoinSelection.scala](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala). It operates on the LogicalPlan and creates hints.

The logic (forgive my Scala skills, I don't use it very often) is:
- `apply` is called and checks whether a strategy was pre-defined by the user (e.g. `broadcast()` used on the rhs of a join)
- then three `bool`s, `manyEmptyInPlan`, `manyEmptyInOther` and `canBroadcastPlan`, decide whether to demote BroadcastHashJoin
- then it is all about branching and creating hints:

```
if (demoteBroadcastHash && preferShuffleHash) {
  Some(SHUFFLE_HASH)
} else if (demoteBroadcastHash) {
  Some(NO_BROADCAST_HASH)
} else if (preferShuffleHash) {
  Some(PREFER_SHUFFLE_HASH)
} else {
  None
}
```

After the hints are injected, it is up to `JoinSelection` to select the proper physical plan: [SparkStrategies.scala#L181](https://github.com/apache/spark/blob/c26a127ba33137f36d55bf95cac71471e2a1704f/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L181)
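For anyone more comfortable reading Rust, here is a minimal sketch of the same hint branching. Everything in it is hypothetical: `JoinStrategyHint`, `select_join_hint` and `resolve_hint` are names made up for illustration and are not existing Ballista, DataFusion or Spark APIs. It also assumes that a user-supplied hint always wins over the adaptive choice, which is my reading of the `apply` check described above.

```rust
/// Hypothetical mirror of the Spark hint names quoted above.
/// Not a Ballista or DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinStrategyHint {
    /// Stands in for a hint the user set explicitly, e.g. via `broadcast()`.
    Broadcast,
    ShuffleHash,
    NoBroadcastHash,
    PreferShuffleHash,
}

/// Maps the two flags onto an optional hint, covering the same four cases
/// as the Scala if/else chain.
pub fn select_join_hint(
    demote_broadcast_hash: bool,
    prefer_shuffle_hash: bool,
) -> Option<JoinStrategyHint> {
    match (demote_broadcast_hash, prefer_shuffle_hash) {
        (true, true) => Some(JoinStrategyHint::ShuffleHash),
        (true, false) => Some(JoinStrategyHint::NoBroadcastHash),
        (false, true) => Some(JoinStrategyHint::PreferShuffleHash),
        (false, false) => None,
    }
}

/// Assumption: a hint pre-defined by the user is kept as-is, and the
/// adaptive choice is only consulted when no user hint is present.
pub fn resolve_hint(
    user_hint: Option<JoinStrategyHint>,
    demote_broadcast_hash: bool,
    prefer_shuffle_hash: bool,
) -> Option<JoinStrategyHint> {
    user_hint.or_else(|| select_join_hint(demote_broadcast_hash, prefer_shuffle_hash))
}
```

For example, `select_join_hint(true, false)` yields `Some(JoinStrategyHint::NoBroadcastHash)`, while `resolve_hint(Some(JoinStrategyHint::Broadcast), true, true)` keeps the user's `Broadcast` hint. How the `demote_broadcast_hash` and `prefer_shuffle_hash` flags would be derived from Ballista's stage statistics is left open here.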
