Hello!
  I would like to use the logging mechanism provided by log4j, but I'm
getting the following exception:
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
Caused by: java.io.NotSerializableException: org.apache.log4j.Logger

The code (and the problem) resembles the one described here:
http://stackoverflow.com/questions/29208844/apache-spark-logging-within-scala,
namely:

import org.apache.log4j.Logger
import org.apache.spark.{SparkConf, SparkContext}

val log = Logger.getLogger(getClass.getName)

def doTest() {
  val conf = new SparkConf().setMaster("local[4]").setAppName("LogTest")
  val spark = new SparkContext(conf)

  val someRdd = spark.parallelize(List(1, 2, 3))
  someRdd.map { element =>
    // this log call is what makes Spark try to serialize the Logger
    log.info(s"$element will be processed")
    element + 1
  }
}
I'm posting the same problem here because the Stack Overflow question
never got an answer.
In this case, can you please tell us what the best way to do logging is?
Is there any solution that does not use rdd.foreachPartition?
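
For reference, the only workaround I've come across so far (apart from
foreachPartition) is to keep the logger out of the closure entirely,
behind a singleton with a @transient lazy val, so each executor JVM
re-creates it instead of serializing it. A rough sketch (the LogHolder
name is just mine, for illustration):

import org.apache.log4j.Logger

object LogHolder extends Serializable {
  // never shipped with the closure: transient, and lazily re-created
  // on first use inside each executor JVM
  @transient lazy val log: Logger = Logger.getLogger(getClass.getName)
}

someRdd.map { element =>
  LogHolder.log.info(s"$element will be processed")
  element + 1
}

Is something like this the recommended approach?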

I look forward to your answers.
Regards,
Florin
