Hi, my programming model requires me to generate multiple RDDs for various datasets within a single run and then run an action on each of them, e.g.:

    MyFunc myFunc = ... // implements VoidFunction
    // set some extra variables - all serializable
    ...
    for (JavaRDD<String> rdd : rddList) {
        ...
        rdd.foreach(myFunc);
    }

The problem I'm seeing is that the first iteration of the loop succeeds on foreach, but the second one fails with java.io.NotSerializableException for a specific object I'm setting. In my particular case, the object contains a reference to org.apache.hadoop.conf.Configuration.

Questions:
1. Why does this succeed the first time and fail the second?
2. Are there any alternatives to this programming model?
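
To make the failure mode concrete, here is a minimal sketch in plain Java, without Spark or Hadoop. It assumes the function object is shipped using standard Java serialization. FakeConf is a hypothetical stand-in for org.apache.hadoop.conf.Configuration, and UnsafeFunc, SafeFunc and serializes are illustrative names that are not part of the code above. It shows that a non-serializable field only breaks serialization once it is actually set, which is one possible (unconfirmed) reason the first iteration could pass, and that marking the field transient and rebuilding it lazily avoids the exception.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in for org.apache.hadoop.conf.Configuration: NOT Serializable.
    static class FakeConf {
        final String name;
        FakeConf(String name) { this.name = name; }
    }

    // Same shape as the failing case: a Serializable function object
    // that holds a reference to a non-serializable object.
    static class UnsafeFunc implements Serializable {
        private static final long serialVersionUID = 1L;
        FakeConf conf; // stays null until something assigns it
    }

    // Possible alternative: keep only serializable state, mark the heavy
    // object transient, and rebuild it on first use after deserialization.
    static class SafeFunc implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String confName;
        private transient FakeConf conf;

        SafeFunc(String confName) { this.confName = confName; }

        FakeConf conf() {
            if (conf == null) {
                conf = new FakeConf(confName);
            }
            return conf;
        }
    }

    // Returns true if Java serialization accepts the object, false if it
    // throws NotSerializableException.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        UnsafeFunc unsafe = new UnsafeFunc();
        System.out.println(serializes(unsafe)); // true: the field is still null

        unsafe.conf = new FakeConf("job");
        System.out.println(serializes(unsafe)); // false: the field is now set

        SafeFunc safe = new SafeFunc("job");
        safe.conf(); // populate the transient field
        System.out.println(serializes(safe));   // true: transient fields are skipped
    }
}
```

Whether this matches the real job depends on when the Configuration reference gets assigned relative to the first foreach call, so treat it as an illustration of the mechanism rather than a diagnosis.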