[ https://issues.apache.org/jira/browse/SPARK-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Wendell resolved SPARK-4019.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0

Fixed by Josh's patch: https://github.com/apache/spark/pull/2866

> Shuffling with more than 2000 reducers may drop all data when partitions are
> mostly empty or cause deserialization errors if at least one partition is
> empty
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-4019
>                 URL: https://issues.apache.org/jira/browse/SPARK-4019
>             Project: Spark
>          Issue Type: Bug
>       Components: Spark Core
> Affects Versions: 1.2.0
>         Reporter: Xiangrui Meng
>         Assignee: Josh Rosen
>         Priority: Blocker
>          Fix For: 1.2.0
>
>
> {code}
> sc.makeRDD(0 until 10, 1000).repartition(2001).collect()
> {code}
> returns `Array()`.
> 1.1.0 doesn't have this issue. Tried both the HASH and SORT shuffle managers.
> This problem can also manifest itself as Snappy deserialization errors if the
> average map output status size is non-zero but there is at least one empty
> partition, e.g.
> sc.makeRDD(0 until 100000, 1000).repartition(2001).collect()
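For context, a minimal spark-shell sketch contrasting behavior just below and just above the 2000-reducer threshold named in the title. This is illustrative only: it assumes an affected pre-fix 1.2.0 build, and the variable names and the attribution of the threshold to the compressed map-status path are my assumptions, not part of the original report.

{code}
// Run inside spark-shell; `sc` is the SparkContext provided by the shell.

// With 2000 reduce partitions (at or below the threshold) the shuffle
// behaves as expected and all elements survive.
val ok = sc.makeRDD(0 until 10, 1000).repartition(2000).collect()
println(ok.length)   // expected: 10

// With 2001 reduce partitions, affected builds reportedly switch to a
// compressed map-output-status representation and return an empty array
// when most partitions are empty.
val bad = sc.makeRDD(0 until 10, 1000).repartition(2001).collect()
println(bad.length)  // on affected builds: 0 (all data dropped)
{code}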