In Java how can I create an RDD with a large number of elements

2014-12-08 Thread Steve Lewis
Assume I don't care about values which may be created in a later map. In Scala I can say val rdd = sc.parallelize(1 to 10, numSlices = 1000), but in Java a JavaSparkContext can only parallelize a List, which is limited to Integer.MAX_VALUE elements and is required to exist in memory. The best I can

Re: In Java how can I create an RDD with a large number of elements

2014-12-08 Thread praveen seluka
Steve, something like this will do, I think:

sc.parallelize(1 to 1000, 1000).flatMap(x => 1 to 100000)

The above will launch 1000 tasks (maps), with each task creating 10^5 numbers (a total of 100 million elements). On Mon, Dec 8, 2014 at 6:17 PM, Steve Lewis lordjoe2...@gmail.com wrote: assume
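The same seed-and-expand trick can be written against the Java API. The sketch below is an assumption-laden illustration, not a tested program: it assumes a local Spark installation on the classpath, and it uses the Spark 2.x+ flatMap signature, where the FlatMapFunction returns an Iterator (in the 1.x releases current when this thread was written, it returned an Iterable, so you would return the List itself instead of its iterator). The class name LargeRdd and the 1000 x 100,000 sizing are illustrative choices, not from the thread.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LargeRdd {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LargeRdd").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Seed list: one entry per task. Only these 1000 Integers
            // ever live in driver memory, not the 100 million elements.
            List<Integer> seeds = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                seeds.add(i);
            }

            // Each of the 1000 partitions expands its single seed into
            // 100,000 longs, so the full data set is only materialized
            // out on the executors, partition by partition.
            JavaRDD<Long> big = sc.parallelize(seeds, 1000)
                .flatMap(seed -> {
                    List<Long> chunk = new ArrayList<>(100_000);
                    long base = (long) seed * 100_000L;
                    for (long j = 0; j < 100_000L; j++) {
                        chunk.add(base + j);
                    }
                    // Spark 2.x+: return an Iterator; under 1.x return chunk directly.
                    return chunk.iterator();
                });

            // 1000 seeds * 100,000 per seed = 100,000,000 elements.
            System.out.println(big.count());
        }
    }
}
```

Because the expansion happens inside flatMap, the element count can exceed Integer.MAX_VALUE even though the driver-side List cannot: only the seed list is bounded by the List size limit.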