No, there's no such thing as an RDD of RDDs in Spark. RDD operations can only be invoked from the driver; when an RDD object is shipped to executors inside a closure, its SparkContext reference isn't available there, which is why the inner count() blows up with a NullPointerException. Here though, why not just operate on an RDD of Lists, or a List of RDDs? Usually one of these two is the right approach whenever you feel inclined to operate on an RDD of RDDs.
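Here's a quick sketch of both alternatives (untested, class name is just for illustration; the data mirrors your example):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.VoidFunction;

    public class NestedRddAlternatives {

        public static void main(String[] args) {
            SparkConf sparkConf = new SparkConf().setAppName("testapp");
            sparkConf.setMaster("local");
            JavaSparkContext sc = new JavaSparkContext(sparkConf);

            // Option 1: a List of RDDs. Loop on the driver and call
            // actions on each RDD there; no RDD ever crosses into a task.
            List<JavaRDD<String>> rddList = new ArrayList<JavaRDD<String>>();
            rddList.add(sc.parallelize(Arrays.asList("1", "2", "3")));
            rddList.add(sc.parallelize(Arrays.asList("a", "b", "c")));
            for (JavaRDD<String> r : rddList) {
                System.out.println(r.count());
            }

            // Option 2: an RDD of Lists. Each element is a plain
            // serializable java.util.List, so tasks can work on it directly.
            List<List<String>> lists = new ArrayList<List<String>>();
            lists.add(Arrays.asList("1", "2", "3"));
            lists.add(Arrays.asList("a", "b", "c"));
            JavaRDD<List<String>> rddOfLists = sc.parallelize(lists);
            rddOfLists.foreach(new VoidFunction<List<String>>() {
                @Override
                public void call(List<String> t) throws Exception {
                    System.out.println(t.size());
                }
            });

            sc.stop();
        }
    }

Option 1 is the usual choice when each inner collection is itself big enough to need distribution; Option 2 is better when the inner collections are small and you just want to fan the work out across the cluster.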
On Wed, Oct 22, 2014 at 3:58 PM, Tomer Benyamini <tomer....@gmail.com> wrote:
> Hello,
>
> I would like to parallelize my work on multiple RDDs I have. I wanted
> to know if spark can support a "foreach" on an RDD of RDDs. Here's a
> java example:
>
>     public static void main(String[] args) {
>
>         SparkConf sparkConf = new SparkConf().setAppName("testapp");
>         sparkConf.setMaster("local");
>
>         JavaSparkContext sc = new JavaSparkContext(sparkConf);
>
>         List<String> list = Arrays.asList(new String[] {"1", "2", "3"});
>         JavaRDD<String> rdd = sc.parallelize(list);
>
>         List<String> list1 = Arrays.asList(new String[] {"a", "b", "c"});
>         JavaRDD<String> rdd1 = sc.parallelize(list1);
>
>         List<JavaRDD<String>> rddList = new ArrayList<JavaRDD<String>>();
>         rddList.add(rdd);
>         rddList.add(rdd1);
>
>         JavaRDD<JavaRDD<String>> rddOfRdds = sc.parallelize(rddList);
>         System.out.println(rddOfRdds.count());
>
>         rddOfRdds.foreach(new VoidFunction<JavaRDD<String>>() {
>
>             @Override
>             public void call(JavaRDD<String> t) throws Exception {
>                 System.out.println(t.count());
>             }
>         });
>     }
>
> From this code I'm getting a NullPointerException on the internal count
> method:
>
>     Exception in thread "main" org.apache.spark.SparkException: Job
>     aborted due to stage failure: Task 1.0:0 failed 1 times, most recent
>     failure: Exception failure in TID 1 on host localhost:
>     java.lang.NullPointerException
>         org.apache.spark.rdd.RDD.count(RDD.scala:861)
>         org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:365)
>         org.apache.spark.api.java.JavaRDD.count(JavaRDD.scala:29)
>
> Help will be appreciated.
>
> Thanks,
> Tomer