[ 
https://issues.apache.org/jira/browse/SPARK-32470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Xue updated SPARK-32470:
----------------------------
    Description: 
The task result of a shuffle map stage is not the query result; it consists 
only of the map status and metrics accumulator updates. Aside from the metrics, 
whose size can vary, the total task result size depends solely on the number of 
tasks, and the number of tasks can grow large regardless of the stage's output 
size. For example, the number of tasks generated by `CartesianProduct` is the 
square of "spark.sql.shuffle.partitions": set to 200 it yields 40,000 tasks, 
and set to 500 it yields 250,000 tasks, which can easily exceed the default 
limit of `spark.driver.maxResultSize`:

 
{code:java}
org.apache.spark.SparkException: Job aborted due to stage failure: Total size 
of serialized results of 66496 tasks (4.0 GiB) is bigger than 
spark.driver.maxResultSize (4.0 GiB)
{code}
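
At that scale each serialized task result averages only about 64 KiB (4.0 GiB 
across 66,496 tasks), so the limit is hit by sheer task count rather than by 
any oversized individual result. As a rough illustration (not from the original 
report; the row counts and partition counts below are made up), a cross join 
followed by an aggregation is enough to make the cartesian stage a shuffle map 
stage with partitions-squared tasks:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("SPARK-32470-illustration")
  .config("spark.sql.shuffle.partitions", "500")
  // Disable broadcast joins so the planner picks CartesianProduct.
  .config("spark.sql.autoBroadcastJoinThreshold", "-1")
  .getOrCreate()
import spark.implicits._

val left = spark.range(0, 1000000).toDF("a").repartition(500)
val right = spark.range(0, 1000000).toDF("b").repartition(500)

// The cartesian stage runs 500 * 500 = 250,000 tasks; the groupBy
// forces another shuffle, so those tasks are shuffle map tasks whose
// results are only map statuses and accumulator updates.
left.crossJoin(right)
  .groupBy($"a").count()
  .write.format("noop").mode("overwrite").save()
{code}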
 

However, the driver uses the map statuses and accumulator updates only to 
update the overall map output stats and the query metrics, and they are not 
cached on the driver, so they won't cause catastrophic driver memory issues. 
We should therefore remove this check for shuffle map stage tasks.
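
For context, a minimal sketch of what skipping the check might look like, 
assuming the size check lives in `TaskSetManager.canFetchMoreResults` as it 
does today; the `isShuffleMapTasks` flag and surrounding details are 
illustrative, not the actual patch:

{code:scala}
// Hypothetical sketch, not the merged change. Assumes the TaskSetManager
// can tell whether its tasks are shuffle map tasks (isShuffleMapTasks
// below is an assumed field, not an existing one).
def canFetchMoreResults(size: Long): Boolean = sched.synchronized {
  totalResultSize += size
  calculatedTasks += 1
  // Only result-stage tasks ship actual query results to the driver,
  // so only they should count against spark.driver.maxResultSize.
  if (!isShuffleMapTasks && maxResultSize > 0 && totalResultSize > maxResultSize) {
    val msg = s"Total size of serialized results of $calculatedTasks tasks " +
      s"(${Utils.bytesToString(totalResultSize)}) is bigger than " +
      s"spark.driver.maxResultSize (${Utils.bytesToString(maxResultSize)})"
    logError(msg)
    abort(msg)
    false
  } else {
    true
  }
}
{code}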

> Remove task result size check for shuffle map stage
> ---------------------------------------------------
>
>                 Key: SPARK-32470
>                 URL: https://issues.apache.org/jira/browse/SPARK-32470
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Wei Xue
>            Priority: Minor
>


