Hi, I am trying to get started with Spark and count the number of lines of a text file on my Mac, but I get an `org.apache.spark.SparkException: Task not serializable` error; the stack trace points at the `logData.filter(...)` call. Please see below for a sample of the code and the stack trace. Any idea why this error is thrown?

Best regards,
Mina

```java
System.out.println("Creating Spark Configuration");
SparkConf javaConf = new SparkConf();
javaConf.setAppName("My First Spark Java Application");
javaConf.setMaster("PATH to my spark");

System.out.println("Creating Spark Context");
JavaSparkContext javaCtx = new JavaSparkContext(javaConf);

System.out.println("Loading the Dataset and will further process it");
String file = "file:///file.txt";
JavaRDD<String> logData = javaCtx.textFile(file);
long numLines = logData.filter(new Function<String, Boolean>() {
    public Boolean call(String s) {
        return true;
    }
}).count();
System.out.println("Number of Lines in the Dataset " + numLines);

javaCtx.close();
```

The stack trace:

```
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:387)
    at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:386)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
    at org.apache.spark.rdd.RDD.filter(RDD.scala:386)
    at org.apache.spark.api.java.JavaRDD.filter(JavaRDD.scala:78)
```
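For context on what the `ClosureCleaner` is complaining about: Spark serializes the closure passed to `filter` before shipping it to executors. An anonymous inner class keeps a hidden reference to its enclosing instance, so if the class that contains your `main`/driver code is not `Serializable`, serializing the `Function` fails with exactly this exception. The mechanism can be demonstrated without Spark at all; the class names below (`ClosureDemo`, `Predicate`, `AlwaysTrue`) are my own for illustration, not from the post:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Shows why "Task not serializable" occurs: an anonymous inner class holds a
// synthetic reference to its enclosing instance, and if that instance is not
// Serializable, Java serialization (which Spark uses for closures) fails.
public class ClosureDemo {

    // Stand-in for Spark's Function<String, Boolean>, which extends Serializable.
    interface Predicate extends Serializable {
        Boolean call(String s);
    }

    // Safe: a static nested class has no reference to any ClosureDemo instance.
    static class AlwaysTrue implements Predicate {
        public Boolean call(String s) { return true; }
    }

    // Unsafe: an anonymous class created in an instance method captures `this`,
    // and ClosureDemo itself is not Serializable.
    Predicate capturing() {
        return new Predicate() {
            public Boolean call(String s) { return true; }
        };
    }

    // Attempts Java serialization and reports whether it succeeded.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {   // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new AlwaysTrue()));              // prints true
        System.out.println(serializes(new ClosureDemo().capturing())); // prints false
    }
}
```

So in the Spark code above, defining the `Function` as a static nested class (or, on Java 8+, using a lambda such as `logData.filter(s -> true).count()`, which captures nothing from the enclosing instance) avoids dragging the non-serializable driver class into the closure.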