[ 
https://issues.apache.org/jira/browse/SPARK-50257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-50257:
----------------------------------
    Affects Version/s: 4.2.0
                           (was: 4.0.0)

> [Core]If a stage contains ExpandExec,  the CoalesceShufflePartitions rule 
> will not be adjusted during the AQE phase
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-50257
>                 URL: https://issues.apache.org/jira/browse/SPARK-50257
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 4.2.0
>            Reporter: guihuawen
>            Priority: Major
>         Attachments: 截屏2024-11-07 13.52.45.png
>
>
> [SQL]
> {code:sql}
> SELECT
>     /*+ SHUFFLE_MERGE(b) */
>     s_date,
>     sum(s_quantity * i_price) AS total_sales
> FROM
>     sales a
>     JOIN items b ON s_item_id = i_item_id
> WHERE
>     i_price < 10
> GROUP BY
>     s_date WITH ROLLUP;
> {code}
> Set spark.sql.shuffle.partitions=1000
> After AQE:
> !截屏2024-11-07 13.52.45.png|width=444,height=431!
> The read parallelism of the stage containing ExpandExec has been reduced 
> to 71 partitions. ExpandExec can expand the data volume, so the reduced 
> parallelism can result in longer task execution times.
> If AQE is turned off entirely, the other stages can no longer benefit from 
> AQE optimization. The proposal: if the current stage is found to contain 
> ExpandExec, partition coalescing is not performed for that stage.
>  
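The proposal above can be illustrated with a small, self-contained sketch. It is not Spark's implementation: `coalesce_partitions`, `plan_partitions`, the `stage_has_expand` flag, the partition sizes, and the 64 MB target are all hypothetical names and values chosen for illustration. The sketch greedily merges adjacent shuffle partitions up to a target size, and skips the merge when the stage is marked as containing an expand operator.

```python
from typing import List


def coalesce_partitions(sizes: List[int], target_size: int) -> List[List[int]]:
    """Greedily merge adjacent partitions so each group stays within target_size.

    Returns a list of groups, each group being the indices of the original
    partitions that would be read together by one task.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for idx, size in enumerate(sizes):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        groups.append(current)
    return groups


def plan_partitions(sizes: List[int], target_size: int,
                    stage_has_expand: bool) -> List[List[int]]:
    """Hypothetical rule: keep the original parallelism for stages that expand data."""
    if stage_has_expand:
        # One task per original shuffle partition, no coalescing.
        return [[i] for i in range(len(sizes))]
    return coalesce_partitions(sizes, target_size)


# 1000 shuffle partitions of 1 MB each (made-up sizes), 64 MB target per task.
sizes = [1] * 1000
coalesced = plan_partitions(sizes, 64, stage_has_expand=False)
kept = plan_partitions(sizes, 64, stage_has_expand=True)
print(len(coalesced), len(kept))
```

With these made-up inputs, the coalesced plan reads the 1000 partitions in far fewer tasks, while the plan for the expand stage keeps all 1000, which is the trade-off the report describes.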



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
