Redocpot, I tried your 2 snippets with spark-shell and both work fine. I only
see a problem if the closure is not serializable.

scala> val rdd1 = sc.parallelize(List(1, 2, 3, 4)) 
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[4] at
parallelize at <console>:12

scala> val a = 1   
a: Int = 1

scala> val rdd2 = rdd1.map(_ + a) 
rdd2: org.apache.spark.rdd.RDD[Int] = MappedRDD[5] at map at <console>:16

scala> rdd2.count
13/12/30 03:50:59 INFO SparkContext: Starting job: count at <console>:19
13/12/30 03:50:59 INFO DAGScheduler: Got job 4 (count at <console>:19) with
2 output partitions (allowLocal=false)
13/12/30 03:50:59 INFO DAGScheduler: Final stage: Stage 4 (count at
<console>:19)
13/12/30 03:50:59 INFO DAGScheduler: Parents of final stage: List()
13/12/30 03:50:59 INFO DAGScheduler: Missing parents: List()
13/12/30 03:50:59 INFO DAGScheduler: Submitting Stage 4 (MappedRDD[5] at map
at <console>:16), which has no missing parents
13/12/30 03:50:59 INFO DAGScheduler: Submitting 2 missing tasks from Stage 4
(MappedRDD[5] at map at <console>:16)
13/12/30 03:50:59 INFO ClusterScheduler: Adding task set 4.0 with 2 tasks
13/12/30 03:50:59 INFO ClusterTaskSetManager: Starting task 4.0:0 as TID 8
on executor 0: worker1 (PROCESS_LOCAL)
13/12/30 03:50:59 INFO ClusterTaskSetManager: Serialized task 4.0:0 as 1839
bytes in 1 ms
13/12/30 03:50:59 INFO ClusterTaskSetManager: Starting task 4.0:1 as TID 9
on executor 1: worker2 (PROCESS_LOCAL)
13/12/30 03:50:59 INFO ClusterTaskSetManager: Serialized task 4.0:1 as 1839
bytes in 1 ms
13/12/30 03:51:00 INFO ClusterTaskSetManager: Finished TID 8 in 152 ms on
worker1 (progress: 1/2)
13/12/30 03:51:00 INFO DAGScheduler: Completed ResultTask(4, 0)
13/12/30 03:51:00 INFO ClusterTaskSetManager: Finished TID 9 in 171 ms on
worker2 (progress: 2/2)
13/12/30 03:51:00 INFO ClusterScheduler: Remove TaskSet 4.0 from pool 
13/12/30 03:51:00 INFO DAGScheduler: Completed ResultTask(4, 1)
13/12/30 03:51:00 INFO DAGScheduler: Stage 4 (count at <console>:19)
finished in 0.131 s
13/12/30 03:51:00 INFO SparkContext: Job finished: count at <console>:19,
took 0.212351498 s
res5: Long = 4
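
For reference, here is a minimal sketch of the failing case I mean (the
Multiplier class is just a made-up example): if the closure captures a
reference to something that is not serializable, Spark cannot ship the task
to the executors and throws a serialization error instead of running the job:

```scala
// A helper class whose instances are NOT serializable
// (it does not extend java.io.Serializable)
class Multiplier(val factor: Int)

val m = new Multiplier(2)

// The closure captures `m`, so Spark must serialize it to send the
// task to the workers -- this fails with a NotSerializableException
val bad = rdd1.map(_ * m.factor)
bad.count   // org.apache.spark.SparkException: Task not serializable

// Fix: copy the primitive value into a local val, so the closure
// captures only an Int, which serializes fine
val f = m.factor
val ok = rdd1.map(_ * f)
ok.count    // works
```

So your snippets are fine because they only capture serializable values
(like the Int above).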

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/closure-and-ExceptionInInitializerError-tp77p98.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.