I can see the test for ParallelCollectionRDD.slice(). But how do I explain the result of my test? The following is the simple code I used for the test:

    a = []
    for i in range(0, 10000):
        a.append(i)
    print sc.parallelize(a, 16).mapPartitions(lambda x: f(x)).collect()

and the result is:

    [0, 523776, 0, 1572352, 2620928, 0, 3669504, 4718080, 0, 5766656, 0, 6815232, 7863808, 0, 8912384, 7532280]
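One way to sanity-check these numbers, as a rough sketch: f is not shown here, so this assumes it yields the sum of each partition. Under that assumption, the non-zero values match the sums of consecutive runs of 1024 integers from 0..9999, which would suggest the elements are grouped into batches of 1024 before being spread over the 16 partitions, instead of being split evenly element by element. The run length CHUNK is inferred from the numbers themselves and is not taken from the Spark documentation. The check is plain Python and does not need a SparkContext.

```python
# Observed output, copied from the result above.
observed = [0, 523776, 0, 1572352, 2620928, 0, 3669504, 4718080,
            0, 5766656, 0, 6815232, 7863808, 0, 8912384, 7532280]

CHUNK = 1024  # assumed run length, inferred from the observed values

data = list(range(10000))

# Sum each consecutive run of CHUNK elements (the last run is shorter).
chunk_sums = [sum(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

# Drop the empty partitions and compare against the run sums.
nonzero = [v for v in observed if v != 0]
print(nonzero == chunk_sums)
```

If the comparison holds, the zeros would simply be partitions that received no batch at all: 10 runs spread over 16 partitions leaves 6 empty.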
2013/9/26 Mike <sp...@good-with-numbers.com>

> > It does in fact attempt to do that, and the tests check for that, but
> > there's no guarantee in its API. Of course "equally" here means +/-
> > one element.
>
> Correction: *except* when the Seq is a NumericRange: then it shorts the
> last partition. E.g., 91 elements split 10 ways -> 10 elements in the
> first 9 partitions, one in the last.
>

--
--
Shangyu, Luo
Department of Computer Science
Rice University
--
Not Just Think About It, But Do It!
--
Success is never final.
--
Losers always whine about their best