Hi, I wonder if someone can suggest a solution to my problem. I had a simple process working with Strings and now want to convert it to use RDD[Char]. The problem is that I end up with a nested RDD call, as follows:

1) I load a text file into an RDD[Char]:

    val inputRDD = sc.textFile("myFile.txt").flatMap(_.toIterator)

2) I have a method that takes two parameters:

    object Foo {
      def myFunction(inputRDD: RDD[Char], shift: Int): RDD[Char] = ...
    }

3) I have a method 'bar' that the driver process calls once it has loaded the inputRDD, as follows:

    def bar(inputRDD: RDD[Char]): Int = {
      val solutionSet = sc.parallelize((1 to alphabetLength).toList)
        .map(shift => (shift, Foo.myFunction(inputRDD, shift)))
      ...
    }

What I'm trying to do is take a list 1..26 and generate a set of tuples { (1, RDD(1)), ..., (26, RDD(26)) }, where each RDD is the inputRDD passed through the function above with a different shift parameter.

In my original version I could parallelise the algorithm fine, but the input had to be held in a 'String' variable. I'd rather it be an RDD, since the string could be large. I think the approach above won't work because it's a nested RDD call: myFunction operates on inputRDD from inside a map over another RDD.

Can anybody suggest a solution?

Regards,
Mike Lewis