[ https://issues.apache.org/jira/browse/SPARK-33832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated SPARK-33832:
-----------------------------------
    Description:
Currently {{OptimizeSkewedJoin}} will not apply if skew mitigation causes a new shuffle.
There are situations where it's better to mitigate skew even if it means a new shuffle is added, for example if the join outputs a small amount of data.
As a first step I propose adding a SQLConf option to enable this.
I'll open a PR shortly to get feedback on the approach.

  was:
Currently {{OptimizeSkewdJoin}} will not apply if skew mitigation causes a new shuffle.
There are situations where it's better to mitigate skew even if it means a new shuffle is added, for example if the join outputs small amount of data.
As a first step I propose adding a SQLConf option to enable this.
I'll open a PR shortly to get feedback on the approach.

> Add an option in AQE to mitigate skew even if it causes an new shuffle
> ----------------------------------------------------------------------
>
>                 Key: SPARK-33832
>                 URL: https://issues.apache.org/jira/browse/SPARK-33832
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Priority: Major
>
> Currently {{OptimizeSkewedJoin}} will not apply if skew mitigation causes a
> new shuffle.
> There are situations where it's better to mitigate skew even if it means a
> new shuffle is added, for example if the join outputs a small amount of data.
> As a first step I propose adding a SQLConf option to enable this.
> I'll open a PR shortly to get feedback on the approach.
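
The behaviour described in the issue can be pictured with a small toy model. This is only a sketch of the decision, not Spark code: JoinPlan, should_mitigate_skew and the force_mitigation flag are hypothetical names made up for illustration, and the issue does not say what the real SQLConf key will be called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinPlan:
    """Toy stand-in for a physical join plan; not a Spark class."""
    has_skew: bool                     # a skewed partition was detected
    extra_shuffles_if_mitigated: int   # shuffles that mitigation would add


def should_mitigate_skew(plan: JoinPlan, force_mitigation: bool = False) -> bool:
    """Decide whether skew mitigation is applied to the plan.

    force_mitigation is a hypothetical stand-in for the proposed SQLConf
    option; the issue does not name the actual configuration key.
    """
    if not plan.has_skew:
        return False  # nothing to mitigate
    if plan.extra_shuffles_if_mitigated == 0:
        return True   # current behaviour: apply when no new shuffle is needed
    # Proposed behaviour: only pay for the new shuffle when the user opts in.
    return force_mitigation
```

With the flag left at its default, a skewed join whose mitigation would add a shuffle is left alone, which matches the current behaviour described above. Passing force_mitigation=True models the proposed opt-in, which could pay off when the join output is small.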