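The fix for this is to make your class Serializable. The reason is that
the closures you have defined in the class need to be serialized and
copied over to all executor nodes. An anonymous inner class such as your
Function, when created in an instance method, keeps a reference to the
enclosing object, so that object gets serialized along with it.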
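
To illustrate the mechanism without Spark, here is a minimal JDK-only
sketch. It is not your program: LineFilter, PlainDriver,
SerializableDriver, AcceptAll, canSerialize and the minLength field are
all hypothetical names made up for the illustration, and LineFilter only
stands in for Spark's Function interface (which likewise extends
Serializable). The check uses plain Java serialization, which I am
assuming matches what Spark does with the closure in its default setup.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ClosureSerializationDemo {

    // Hypothetical stand-in for Spark's Function<String, Boolean>.
    interface LineFilter extends Serializable {
        Boolean call(String s);
    }

    // Not Serializable. The anonymous filter reads minLength, so it must
    // hold a reference to the enclosing PlainDriver instance.
    static class PlainDriver {
        final int minLength = 0;

        LineFilter makeFilter() {
            return new LineFilter() {
                public Boolean call(String s) {
                    return s.length() >= minLength;
                }
            };
        }
    }

    // Same code, but the enclosing class is Serializable (the suggested fix).
    static class SerializableDriver implements Serializable {
        final int minLength = 0;

        LineFilter makeFilter() {
            return new LineFilter() {
                public Boolean call(String s) {
                    return s.length() >= minLength;
                }
            };
        }
    }

    // Alternative: a static nested class has no enclosing instance at all.
    static class AcceptAll implements LineFilter {
        public Boolean call(String s) {
            return true;
        }
    }

    // Returns true if the object survives Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain enclosing class:        "
            + canSerialize(new PlainDriver().makeFilter()));
        System.out.println("serializable enclosing class: "
            + canSerialize(new SerializableDriver().makeFilter()));
        System.out.println("static nested filter:         "
            + canSerialize(new AcceptAll()));
    }
}
```

Only the first case fails, because serializing the filter drags the
non-serializable enclosing object along. If you would rather not make
the whole class Serializable, moving the filter into a static nested
class (as AcceptAll does here) avoids the hidden reference.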

Hope this helps.

Thanks
Ankur

On Mon, Mar 6, 2017 at 1:06 PM, Mina Aslani <aslanim...@gmail.com> wrote:

> Hi,
>
> I am trying to get started with Spark and count the lines of a text file
> on my Mac, but I get
>
> org.apache.spark.SparkException: Task not serializable error on
>
> JavaRDD<String> logData = javaCtx.textFile(file);
>
> Please see below for the sample of code and the stackTrace.
>
> Any idea why this error is thrown?
>
> Best regards,
>
> Mina
>
> System.out.println("Creating Spark Configuration");
> SparkConf javaConf = new SparkConf();
> javaConf.setAppName("My First Spark Java Application");
> javaConf.setMaster("PATH to my spark");
> System.out.println("Creating Spark Context");
> JavaSparkContext javaCtx = new JavaSparkContext(javaConf);
> System.out.println("Loading the Dataset and will further process it");
> String file = "file:///file.txt";
> JavaRDD<String> logData = javaCtx.textFile(file);
>
> long numLines = logData.filter(new Function<String, Boolean>() {
>    public Boolean call(String s) {
>       return true;
>    }
> }).count();
>
> System.out.println("Number of Lines in the Dataset "+numLines);
>
> javaCtx.close();
>
> Exception in thread "main" org.apache.spark.SparkException: Task not serializable
>       at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
>       at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>       at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>       at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
>       at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:387)
>       at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:386)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>       at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>       at org.apache.spark.rdd.RDD.filter(RDD.scala:386)
>       at org.apache.spark.api.java.JavaRDD.filter(JavaRDD.scala:78)
>
>
