Something like this?

    (2 to alphabetLength).toList
      .map(shift => Foo.myFunction(inputRDD, shift).map(v => shift -> v))
      .foldLeft(Foo.myFunction(inputRDD, 1).map(v => 1 -> v))(_ union _)
which is an RDD[(Int, Char)]. The problem is that you can't nest RDDs inside other RDDs: that recursive structure breaks the Spark programming model.

On Sat, Mar 21, 2015 at 10:26 AM, Michael Lewis <lewi...@me.com> wrote:
> Hi,
>
> I wonder if someone can help suggest a solution to my problem. I had a
> simple process working using Strings and now want to convert to RDD[Char].
> The problem is that I end up with a nested call, as follows:
>
> 1) Load a text file into an RDD[Char]:
>
>     val inputRDD = sc.textFile("myFile.txt").flatMap(_.toIterator)
>
> 2) I have a method that takes two parameters:
>
>     object Foo {
>       def myFunction(inputRDD: RDD[Char], shift: Int): RDD[Char] = ...
>     }
>
> 3) I have a method 'bar' that the driver process calls once it has loaded
> the inputRDD, as follows:
>
>     def bar(inputRDD: RDD[Char]): Int = {
>
>       val solutionSet = sc.parallelize((1 to alphabetLength).toList)
>         .map(shift => (shift, Foo.myFunction(inputRDD, shift)))
>
> What I'm trying to do is take the list 1..26 and generate a set of tuples
> { (1, RDD(1)), ..., (26, RDD(26)) }, i.e. the inputRDD passed through the
> function above with a different shift parameter each time.
>
> In my original version I could parallelise the algorithm fine, but my
> input had to be in a 'String' variable; I'd rather it be an RDD (the
> string could be large). I think the way I'm trying to do it above won't
> work because it is a nested RDD call.
>
> Can anybody suggest a solution?
>
> Regards,
> Mike Lewis
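For illustration, here is a minimal, self-contained sketch of the fold-union pattern suggested above, using plain Scala collections in place of RDDs so it runs without a SparkContext. The `caesar` function is a hypothetical stand-in for `Foo.myFunction` (the real implementation is never shown in the thread); the tagging and folding logic is the same shape you would use with `RDD.union`.

```scala
object ShiftAll {
  val alphabetLength = 26

  // Hypothetical stand-in for Foo.myFunction: shift each letter forward
  // by `shift` places, leaving non-letters untouched.
  def caesar(input: Seq[Char], shift: Int): Seq[Char] =
    input.map {
      case c if c.isLower => (((c - 'a' + shift) % 26) + 'a').toChar
      case c if c.isUpper => (((c - 'A' + shift) % 26) + 'A').toChar
      case c              => c
    }

  // Tag each shifted stream with its shift value, then union everything
  // into one flat collection of (shift, char) pairs -- the same flat
  // RDD[(Int, Char)] shape as in the reply, with no nesting of RDDs.
  def allShifts(input: Seq[Char]): Seq[(Int, Char)] =
    (2 to alphabetLength).toList
      .map(shift => caesar(input, shift).map(v => shift -> v))
      .foldLeft(caesar(input, 1).map(v => 1 -> v))(_ ++ _)
}
```

The key point is that the result is one flat collection keyed by shift, so in Spark you can recover any single shift's output afterwards with a `filter` (or `groupByKey`) on the key, instead of holding 26 separate RDDs inside another RDD.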