[ https://issues.apache.org/jira/browse/SPARK-32470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-32470:
----------------------------------
    Affects Version/s: 2.4.6
                       3.0.0

> Remove task result size check for shuffle map stage
> ---------------------------------------------------
>
>                 Key: SPARK-32470
>                 URL: https://issues.apache.org/jira/browse/SPARK-32470
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.6, 3.0.0, 3.1.0
>            Reporter: Wei Xue
>            Priority: Minor
>
> The task result of a shuffle map stage is not the query result; it consists
> only of the map status and metrics accumulator updates. Aside from the
> metrics, which can vary in size, the total task result size depends solely
> on the number of tasks, and the number of tasks can grow large regardless of
> the stage's output size. For example, the number of tasks generated by
> `CartesianProduct` is the square of "spark.sql.shuffle.partitions": if
> "spark.sql.shuffle.partitions" is set to 200, you get 40,000 tasks; if set
> to 500, you get 250,000 tasks. This can easily exceed the limit set by
> `spark.driver.maxResultSize`:
>
> {code:java}
> org.apache.spark.SparkException: Job aborted due to stage failure: Total size
> of serialized results of 66496 tasks (4.0 GiB) is bigger than
> spark.driver.maxResultSize (4.0 GiB)
> {code}
>
> However, the map status and accumulator updates are only used by the driver
> to update the overall map stats and metrics of the query, and they are not
> cached on the driver, so they will not cause catastrophic memory issues
> there. We should therefore remove this check for shuffle map stage tasks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
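The quadratic growth described in the issue can be sketched with a back-of-the-envelope calculation. This is not Spark code; the per-task result size below is an assumption inferred from the error message quoted in the issue (4.0 GiB across 66,496 tasks is roughly 64 KiB per task result), and 4 GiB is used as the `spark.driver.maxResultSize` limit only because that is the value shown in that log line.

```python
# Sketch: why a cartesian join can trip spark.driver.maxResultSize even
# though each individual task result (map status + accumulator updates)
# is small. Assumptions: per-task size inferred from the quoted error
# message; limit taken from the same message, not a Spark default.

GiB = 1024 ** 3

def cartesian_task_count(shuffle_partitions: int) -> int:
    # CartesianProduct pairs every partition on one side with every
    # partition on the other, so the task count is the square of
    # spark.sql.shuffle.partitions.
    return shuffle_partitions * shuffle_partitions

def total_result_bytes(num_tasks: int, per_task_bytes: int) -> int:
    # Total serialized result size the driver-side check sums up.
    return num_tasks * per_task_bytes

per_task = (4 * GiB) // 66496   # ~64 KiB, inferred from the log above
limit = 4 * GiB                  # spark.driver.maxResultSize in the log

for partitions in (200, 500):
    tasks = cartesian_task_count(partitions)
    total = total_result_bytes(tasks, per_task)
    print(f"{partitions} partitions -> {tasks} tasks, "
          f"~{total / GiB:.1f} GiB, exceeds limit: {total > limit}")
```

At 200 partitions (40,000 tasks) the total stays under the 4 GiB limit, but at 500 partitions (250,000 tasks) it is several times over it, matching the failure mode the issue describes.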