[ https://issues.apache.org/jira/browse/SPARK-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matei Zaharia updated SPARK-1817:
---------------------------------
    Priority: Major  (was: Minor)

> RDD zip erroneous when partitions do not divide RDD count
> ---------------------------------------------------------
>
>                 Key: SPARK-1817
>                 URL: https://issues.apache.org/jira/browse/SPARK-1817
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0, 1.0.0
>            Reporter: Michael Malak
>            Assignee: Kan Zhang
>
> Example:
> scala> sc.parallelize(1L to 2L,4).zip(sc.parallelize(11 to 12,4)).collect
> res1: Array[(Long, Int)] = Array((2,11))
> But more generally, it's whenever the number of partitions does not evenly
> divide the total number of elements in the RDD.
> See https://groups.google.com/forum/#!msg/spark-users/demrmjHFnoc/Ek3ijiXHr2MJ

--
This message was sent by Atlassian JIRA
(v6.2#6252)
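
One possible reading of the example in the report, offered as an assumption rather than a confirmed diagnosis: if the `Long` range and the `Int` range are sliced into partitions by different rules, the same positions can land in different partitions, and a partition-by-partition zip then silently drops every element that has no partner. The plain-Scala sketch below needs no Spark. The names `ZipSketch`, `sliceBySize`, `sliceByPosition`, `zipPartitions`, and `zipRanges` are hypothetical, and both slicing rules are illustrative, not taken from Spark's source.

```scala
// Plain-Scala sketch (no Spark required). Both slicing rules are assumptions
// made for illustration; they are not copied from Spark.
object ZipSketch {
  // Hypothetical rule A: fill partitions front to back, ceil(n / p) elements each.
  def sliceBySize[T](xs: Seq[T], p: Int): Seq[Seq[T]] = {
    val size = (xs.length + p - 1) / p
    (0 until p).map(i => xs.slice(i * size, (i + 1) * size))
  }

  // Hypothetical rule B: partition i holds positions [i*n/p, (i+1)*n/p).
  def sliceByPosition[T](xs: Seq[T], p: Int): Seq[Seq[T]] =
    (0 until p).map { i =>
      val start = ((i * xs.length.toLong) / p).toInt
      val end = (((i + 1) * xs.length.toLong) / p).toInt
      xs.slice(start, end)
    }

  // Zip partition by partition. Scala's zip truncates to the shorter side,
  // so elements without a partner in the same partition are lost silently.
  def zipPartitions[A, B](as: Seq[Seq[A]], bs: Seq[Seq[B]]): Seq[(A, B)] =
    as.zip(bs).flatMap { case (a, b) => a.zip(b) }

  // Slice the Long values with rule A and the Int values with rule B, then zip.
  def zipRanges(longs: Seq[Long], ints: Seq[Int], p: Int): Seq[(Long, Int)] =
    zipPartitions(sliceBySize(longs.toList, p), sliceByPosition(ints.toList, p))
}
```

Under these assumptions, `ZipSketch.zipRanges(1L to 2L, 11 to 12, 4)` yields only the pair `(2,11)`, which matches the output in the report: rule A places the two `Long` values in partitions 0 and 1, while rule B places the two `Int` values in partitions 1 and 3. With 4 elements in 4 partitions the two rules agree and all four pairs survive.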