works fine. Spark 1.1.0 on REPL On Sat, Oct 25, 2014 at 1:41 PM, octavian.ganea <octavian.ga...@inf.ethz.ch> wrote:
> There is for sure a bug in the Accumulators code. > > More specifically, the following code works well as expected: > > def main(args: Array[String]) { > val conf = new SparkConf().setAppName("EL LBP SPARK") > val sc = new SparkContext(conf) > val accum = sc.accumulator(0) > sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x) > sc.stop > } > > but the following code (adding just a for loop) gives the weird error : > def run(args: Array[String]) { > val conf = new SparkConf().setAppName("EL LBP SPARK") > val sc = new SparkContext(conf) > val accum = sc.accumulator(0) > for (i <- 1 to 10) { > sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x) > } > sc.stop > } > > > the error: > Exception in thread "main" org.apache.spark.SparkException: Job aborted due > to stage failure: Task not serializable: java.io.NotSerializableException: > org.apache.spark.SparkContext > at > org.apache.spark.scheduler.DAGScheduler.org > $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033) > at > > org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017) > at > > org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015) > at > > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at > scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) > at > org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015) > at > org.apache.spark.scheduler.DAGScheduler.org > $apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:770) > at > org.apache.spark.scheduler.DAGScheduler.org > $apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:713) > at > > org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:697) > at > > org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1176) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) > at akka.actor.ActorCell.invoke(ActorCell.scala:456) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) > at akka.dispatch.Mailbox.run(Mailbox.scala:219) > at > > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > > > Can someone confirm this bug ? > > Related to this: > > http://apache-spark-user-list.1001560.n3.nabble.com/Accumulators-Task-not-serializable-java-io-NotSerializableException-org-apache-spark-SparkContext-td17262.html > > > http://apache-spark-user-list.1001560.n3.nabble.com/NullPointerException-when-using-Accumulators-on-cluster-td17261.html > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Bug-in-Accumulators-tp17263.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > >