[ https://issues.apache.org/jira/browse/SPARK-42776 ]
Timothy Miller deleted comment on SPARK-42776:
----------------------------------------

was (Author: JIRAUSER287471):
    A little more detail about the sequence of events that causes this bug:
 * org.apache.spark.sql.execution.RemoveRedundantProjects is applied
 * that causes a BroadcastHashJoinExec to be created
 * org.apache.spark.sql.execution.exchange.EnsureRequirements is applied
 * BroadcastHashJoinExec.requiredChildDistribution gets called, creating the hashmap object that gets broadcast
 * a few more rules are applied, followed by org.apache.spark.sql.execution.ApplyColumnarRulesAndInsertTransitions
 * Only after that can I replace BroadcastHashJoinExec with a columnar alternative, but by then it's too late.

I can't find a way to inject extra rules into or between RemoveRedundantProjects and EnsureRequirements, so there doesn't seem to be a workaround either.

> BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-42776
>                 URL: https://issues.apache.org/jira/browse/SPARK-42776
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.3.1
>         Environment: I'm prototyping on a Mac, but that's not really relevant.
>            Reporter: Timothy Miller
>            Priority: Major
>
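
The ordering problem described in the comment can be illustrated with a small toy model. This is only a sketch: `ToyJoin`, `ensure_requirements`, `columnar_replacement`, and `run_rules` are hypothetical names that imitate the reported sequence and are not Spark APIs. It assumes, as the comment reports, that querying the join's required distribution has a side effect, and it shows that a replacement rule running later cannot undo that side effect.

```python
# Toy model only: these names imitate the ordering described in the comment
# and are NOT Spark APIs.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyJoin:
    """Hypothetical stand-in for a physical join node."""
    name: str
    log: List[str] = field(default_factory=list)

    def required_child_distribution(self) -> str:
        # Assumption taken from the comment: the side effect happens as soon
        # as the requirement is queried.
        self.log.append(f"{self.name}: broadcast relation built")
        return "broadcast"


def ensure_requirements(node: ToyJoin) -> ToyJoin:
    # Queries the node's requirements, which triggers the side effect.
    node.required_child_distribution()
    return node


def columnar_replacement(node: ToyJoin) -> ToyJoin:
    # Swaps in a columnar node; the log is carried over to show what already ran.
    return ToyJoin("ColumnarJoin", node.log + ["replaced with ColumnarJoin"])


def run_rules(node: ToyJoin, rules: List[Callable[[ToyJoin], ToyJoin]]) -> ToyJoin:
    # Applies the rules strictly in the given order, like a fixed rule pipeline.
    for rule in rules:
        node = rule(node)
    return node


plan = run_rules(
    ToyJoin("BroadcastHashJoin"),
    [ensure_requirements, columnar_replacement],
)
print(plan.log)
```

In this model the log records the broadcast relation being built before the replacement happens, which mirrors the complaint that the columnar rule runs too late. Reversing the rule order in the list would attribute the work to the columnar node instead, which is the ordering the reporter could not achieve.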