Hi,

I have the following code structure. It compiles fine, but at runtime it aborts
with the error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted:
Task not serializable: java.io.NotSerializableException: 
I am running in local (standalone) mode.

trait A {
  def input(...): ...
  def output(...)
  def computeSim(...): ... = { ... }   // implemented once in the trait
}

class TA extends A {
   override def input(...) = { ... }
   override def output(...) = { ... }
}

object TA{
    def main(...) {

     val c  = new TA
     val r  = c.input()
     val s = c.computeSim(r)
     c.output(s) 

   }
}

When all of the code is in a single object, it runs and produces the correct
result. The error occurs only when I split it into the trait and class above,
which I am doing to make the code more modular.
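In case it helps, here is a minimal, self-contained sketch of that structure.
All of the names, types, paths and logic are placeholders I made up for this
post (my real code is more involved), so it is not necessarily an exact
reproduction of the failure:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

trait A {
  def input(sc: SparkContext): RDD[String]
  def output(sim: RDD[(String, Double)]): Unit
  // shared logic implemented once in the trait
  def computeSim(r: RDD[String]): RDD[(String, Double)] =
    r.map(s => (s, s.length.toDouble))
}

class TA extends A {
  override def input(sc: SparkContext): RDD[String] =
    sc.textFile("data.txt")                        // placeholder path
  override def output(sim: RDD[(String, Double)]): Unit = {
    val top = sim.map(p => (p._1, p._2 * 2))       // closure defined inside the class
      .collect()
      .filter(_._2 > 1.0)
      .sortBy(p => -p._2)
      .take(5)
    top.foreach(println)
  }
}

object TA {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("TA"))
    val c = new TA
    val r = c.input(sc)
    val s = c.computeSim(r)
    c.output(s)
    sc.stop()
  }
}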

The error appears to come from the output() method. The operations I use
there, in the order in which they appear, are:

...map().collect().filter().sortBy().take()

It appears that both collect() and take() are not serializable (even though
I am running in local mode). If I drop collect(), I get a compile error at
sortBy(). I need both collect() and take(). I am not sure why the same
operations work when everything is in a single object, but fail when I use
the class and trait.
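
To make sure I am describing the chain clearly, here is a tiny standalone
version of it (the data and numbers are made up). My understanding is that
collect() returns a plain Array on the driver, so the filter/sortBy/take
after it are ordinary Scala collection calls rather than RDD operations,
which is probably why dropping collect() changes what compiles:

import org.apache.spark.{SparkConf, SparkContext}

// Standalone illustration of the map/collect/filter/sortBy/take chain.
// Everything after collect() runs on the driver over a local Array.
object ChainDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("ChainDemo"))
    val top = sc.parallelize(Seq(("a", 3.0), ("bb", 1.0), ("ccc", 2.0)))
      .map { case (k, v) => (k, v * 10) }   // RDD transformation: closure is shipped to tasks
      .collect()                            // action: Array[(String, Double)] on the driver
      .filter(_._2 > 10.0)                  // plain Scala Array operations from here on
      .sortBy(p => -p._2)
      .take(2)
    top.foreach(println)
    sc.stop()
  }
}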
 
I would appreciate your help in fixing this.

thanks
