Ah... I think you're right about the flatMap then :). Or you could use
mapPartitions. (I'm not sure if it makes a difference.)
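
For reference, a minimal sketch of the mapPartitions variant in Scala, assuming a local Spark installation. The generateRandomObject body, the object name GenerateInParallel, and the sizes are hypothetical placeholders, not anything from the actual job. Only a small range of seeds is parallelized from the driver; the objects themselves are produced lazily on the executors, one partition at a time.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Random

object GenerateInParallel {
  // Hypothetical stand-in for the thread's generateRandomObject.
  def generateRandomObject(rng: Random): Double = rng.nextGaussian()

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for trying this out on one machine.
    val conf = new SparkConf().setAppName("generate-in-parallel").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val numPartitions = 100        // illustrative size
      val objectsPerPartition = 1000 // illustrative size

      // Only this small range of seeds lives on the driver: one seed per partition.
      val seeds = sc.parallelize(0 until numPartitions, numSlices = numPartitions)

      // Each partition expands its seed into many objects through a lazy iterator,
      // so the full data set is never materialized on the driver.
      val rdd = seeds.mapPartitions { seedIter =>
        seedIter.flatMap { seed =>
          val rng = new Random(seed)
          Iterator.fill(objectsPerPartition)(generateRandomObject(rng))
        }
      }

      println(rdd.count())
    } finally {
      sc.stop()
    }
  }
}
```

With one seed per partition, a plain flatMap on the seeds RDD would behave the same way here; mapPartitions mainly helps if there is per-partition setup to share.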
On Mon, Dec 8, 2014 at 10:09 PM, Steve Lewis lordjoe2...@gmail.com wrote:
Looks good, but how do I say that in Java?
As far as I can see, sc.parallelize (in Java) has only one implementation,
which takes a List, requiring an in-memory representation.
Hi,
I think you have the right idea. I would not even worry about flatMap.
val rdd = sc.parallelize(1 to 100, numSlices = 1000).map(x =>
generateRandomObject(x))
Then when you try to evaluate something on this RDD, it will happen
partition-by-partition. So 1000 random objects will be
Looks good, but how do I say that in Java?
As far as I can see, sc.parallelize (in Java) has only one implementation,
which takes a List, requiring an in-memory representation.
On Mon, Dec 8, 2014 at 12:06 PM, Daniel Darabos
daniel.dara...@lynxanalytics.com wrote:
Hi,
I think you have the right