Re: How to merge a RDD of RDDs into one uber RDD

2015-01-07 Thread Raghavendra Pandey
You can also use the join function of RDD. It is actually a kind of append function that adds up all the RDDs and creates one uber RDD.
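(The thread below settles on union() as the operation actually meant here. A minimal, hypothetical sketch of appending two RDDs with RDD.union, whose alias is ++, assuming a local SparkContext:)

    import org.apache.spark.{SparkConf, SparkContext}

    object AppendRdds {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("append-rdds").setMaster("local[*]"))
        val rdd1 = sc.parallelize(Seq(1, 2, 3))
        val rdd2 = sc.parallelize(Seq(4, 5, 6))

        // RDD.union (alias ++) concatenates the partitions of both inputs;
        // no shuffle is performed and duplicates are kept.
        val combined = rdd1 ++ rdd2  // same as rdd1.union(rdd2)

        combined.collect().foreach(println)
        sc.stop()
      }
    }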

Re: How to merge a RDD of RDDs into one uber RDD

2015-01-07 Thread rkgurram
Thank you for the response; sure, I will try that out. Currently I changed my code so that the first map, files.map, becomes files.flatMap, which I guess does something similar to what you are saying: it gives me a List[] of elements (in this case LabeledPoints; I could also do RDDs), which I then turned into a …
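(A minimal sketch of the flatMap approach described above, with hypothetical file paths and a hypothetical CSV-to-LabeledPoint parser; it assumes the paths are readable from every executor, since the parsing runs on the workers:)

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    object FlatMapFiles {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("flatmap-files").setMaster("local[*]"))

        // Hypothetical RDD of file names, as in the original question.
        val files = sc.parallelize(Seq("/data/part-0.csv", "/data/part-1.csv"))

        // flatMap flattens each file's List[LabeledPoint] into one RDD[LabeledPoint],
        // avoiding any attempt to nest RDDs inside an RDD.
        val points = files.flatMap { path =>
          scala.io.Source.fromFile(path).getLines().map { line =>
            val cols = line.split(',').map(_.toDouble)
            LabeledPoint(cols.head, Vectors.dense(cols.tail))
          }.toList
        }

        println(points.count())
        sc.stop()
      }
    }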

Re: How to merge a RDD of RDDs into one uber RDD

2015-01-07 Thread Sean Owen
I think you mean union(). Yes, you could also simply make an RDD for each file and use SparkContext.union() to put them together.
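(A minimal sketch of that suggestion, with hypothetical paths: build one RDD per file on the driver, then combine them in a single SparkContext.union call rather than chaining pairwise unions, which builds a needlessly deep lineage:)

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.RDD

    object UnionPerFileRdds {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("union-files").setMaster("local[*]"))

        // Hypothetical list of input paths.
        val paths = Seq("/data/part-0.txt", "/data/part-1.txt")

        // One RDD per file, created on the driver...
        val perFile: Seq[RDD[String]] = paths.map(sc.textFile(_))

        // ...then merged in one call: SparkContext.union(Seq[RDD[T]]).
        val uber: RDD[String] = sc.union(perFile)

        println(uber.count())
        sc.stop()
      }
    }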

Re: How to merge a RDD of RDDs into one uber RDD

2015-01-06 Thread k.tham
An RDD cannot contain elements of type RDD (i.e., you can't nest RDDs within RDDs; in fact, I don't think that would make any sense). I suggest that rather than having an RDD of file names, you collect those file name strings back on to the driver as a Scala array of file names, and then from there, make an array of RDDs, one per file, and union them into a single RDD.
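(A minimal sketch of this collect-then-union pattern, with hypothetical paths; it assumes the list of file names is small enough to hold on the driver:)

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.RDD

    object CollectThenUnion {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("collect-then-union").setMaster("local[*]"))

        // Hypothetical RDD of file names, as in the original question.
        val fileNames: RDD[String] = sc.parallelize(Seq("/data/a.txt", "/data/b.txt"))

        // Bring the (small) set of names back to the driver...
        val names: Array[String] = fileNames.collect()

        // ...make one RDD per name, then union them into a single uber RDD.
        val rdds: Seq[RDD[String]] = names.toSeq.map(sc.textFile(_))
        val uber: RDD[String] = sc.union(rdds)

        println(uber.count())
        sc.stop()
      }
    }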