[ https://issues.apache.org/jira/browse/SPARK-35264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340026#comment-17340026 ]
Dongjoon Hyun commented on SPARK-35264:
---------------------------------------

Thank YOU, [~ulysses]! :)

> Support AQE side broadcastJoin threshold
> ----------------------------------------
>
>                 Key: SPARK-35264
>                 URL: https://issues.apache.org/jira/browse/SPARK-35264
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: ulysses you
>            Assignee: ulysses you
>            Priority: Major
>             Fix For: 3.2.0
>
>
> The main idea here is to isolate the join configuration of the normal planner
> from that of the AQE planner, which share the same code path.
> In practice, we do not fully trust static statistics to decide whether a
> broadcast hash join can be built. In our experience, it is very common for
> Spark to throw a broadcast timeout or a driver-side OOM exception when
> executing a somewhat large plan. Moreover, a broadcast join is not reversible:
> once a join has been converted to a broadcast hash join, AQE cannot optimize
> it again. It therefore makes sense to decide on the AQE side whether to
> broadcast, using a separate SQL config.
> To achieve this, we insert a specific join hint in advance within the AQE
> framework; JoinSelection then picks up and follows the inserted hint.
> For now, strategy selection is supported only for equi-joins, in this order:
> 1. mark the join as a broadcast hash join if possible
> 2. mark the join as a shuffled hash join if possible
> Note that we do not override the join strategy if the user specifies a join hint.
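
The selection order described in the issue can be sketched roughly as follows. This is a plain-Python model for illustration only, not Spark's implementation: the `EquiJoin` class, its fields, the `select_aqe_hint` function, and the hint strings returned here are all hypothetical names, and the threshold parameter stands in for whatever AQE-side SQL config the change introduces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EquiJoin:
    """Hypothetical stand-in for an equi-join seen by the AQE planner."""
    build_side_bytes: int             # runtime size of the candidate build side
    can_shuffled_hash: bool           # assumed result of a shuffled-hash-join check
    user_hint: Optional[str] = None   # join hint written by the user, if any


def select_aqe_hint(join: EquiJoin, aqe_broadcast_threshold: int) -> Optional[str]:
    """Return the hint AQE would insert, or None to leave the join untouched."""
    # A user-specified join hint is never overridden.
    if join.user_hint is not None:
        return None
    # 1. Mark as broadcast hash join if the build side fits the AQE-side threshold.
    if join.build_side_bytes <= aqe_broadcast_threshold:
        return "BROADCAST"
    # 2. Otherwise mark as shuffled hash join if that is possible.
    if join.can_shuffled_hash:
        return "SHUFFLE_HASH"
    # Neither applies: insert no hint.
    return None
```

In this model, JoinSelection would simply follow whatever hint the function returns. Because the threshold is passed in separately, the AQE planner can use a different value from the normal planner, which is the isolation the issue asks for.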