You can also use the join function of RDD. This is actually a kind of append
function that adds up all the RDDs and creates one uber RDD.
On Wed, Jan 7, 2015, 14:30 rkgurram rkgur...@gmail.com wrote:
Thank you for the response; sure, I will try that out.
Currently I changed my code so that the first map, files.map, became
files.flatMap, which I guess does something similar to what you are saying: it gives
me a List[] of elements (in this case LabeledPoints; I could also do RDDs),
which I then turned into a
I think you mean union(). Yes, you could also simply make an RDD for each
file, and use SparkContext.union() to put them together.
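A minimal sketch of that suggestion — one RDD per file, combined with SparkContext.union(). The SparkContext `sc`, the `fileNames` sequence, and the use of textFile are assumptions for illustration, not from the thread:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper: build one RDD per input file, then union them all
// into a single RDD. `sc` and `fileNames` are assumed to exist.
def combineFiles(sc: SparkContext, fileNames: Seq[String]): RDD[String] = {
  val perFile: Seq[RDD[String]] = fileNames.map(name => sc.textFile(name))
  sc.union(perFile) // one RDD containing the lines of every file
}
```

SparkContext.union takes a whole sequence of RDDs at once, so this avoids chaining pairwise `rdd1.union(rdd2)` calls when the number of files is large.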
On Wed, Jan 7, 2015 at 9:51 AM, Raghavendra Pandey
raghavendra.pan...@gmail.com wrote:
You can also use the join function of RDD. This is actually a kind of append
An RDD cannot contain elements of type RDD (i.e. you can't nest RDDs within
RDDs; in fact, I don't think it makes any sense).
I suggest that, rather than having an RDD of file names, you collect those file name
strings back onto the driver as a Scala array of file names, and then from
there, make an array
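The suggestion above might be sketched like this. The names `fileNamesRDD` and `parseLabeledPoint` are hypothetical stand-ins for whatever produces the file names and parses each line; only collect(), textFile(), map, and SparkContext.union() are real Spark API:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Assumed to exist: sc, an RDD of file-name strings, and a line parser.
// val fileNamesRDD: RDD[String] = ...
// def parseLabeledPoint(line: String): LabeledPoint = ...

// 1. Bring the file names back to the driver as a plain Scala array.
val fileNames: Array[String] = fileNamesRDD.collect()

// 2. From the driver, build one RDD of LabeledPoints per file.
val rdds: Array[RDD[LabeledPoint]] = fileNames.map { name =>
  sc.textFile(name).map(parseLabeledPoint)
}

// 3. Union them into a single RDD.
val all: RDD[LabeledPoint] = sc.union(rdds)
```

This keeps each element of the driver-side array a real RDD, which sidesteps the nesting problem: the RDDs are never elements of another RDD, only of an ordinary Scala collection on the driver.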