Hi Guys,

I'm having an issue in standalone mode (Spark 1.1, Hadoop 2.4, Windows Server 2008).

A very simple program runs fine in local mode but fails in standalone mode.

Here is the error:

14/11/20 17:01:53 INFO DAGScheduler: Failed to run count at SimpleApp.scala:22
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, UK-RND-PN02.actixhost.eu): java.lang.ClassNotFoundException: SimpleApp$$anonfun$1
        java.net.URLClassLoader$1.run(URLClassLoader.java:202)

I added the jar to the SparkConf() to be on the safe side, and it appears in the standard output (copied after the code):

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

import java.net.URLClassLoader

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "S:\\spark-1.1.0-bin-hadoop2.4\\README.md"
    val conf = new SparkConf()
      //.setJars(Seq("s:\\spark\\simple\\target\\scala-2.10\\simple-project_2.10-1.0.jar"))
      .setMaster("spark://UK-RND-PN02.actixhost.eu:7077")
      //.setMaster("local[4]")
      .setAppName("Simple Application")
    val sc = new SparkContext(conf)

    // Dump the URLs on the system classloader to verify the jar is visible
    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Executor classpath is:" + url.getFile))

    val logData = sc.textFile(logFile, 2).cache()
    // Each filter closure below compiles to a SimpleApp$$anonfun$N class,
    // which is what the executor fails to load
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}
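For completeness, here is the setJars variant I also tried (the line that is commented out above, with the jar path as built by sbt):

// Variant with the application jar listed explicitly in the SparkConf,
// so Spark ships it to the executors
val conf = new SparkConf()
  .setJars(Seq("s:\\spark\\simple\\target\\scala-2.10\\simple-project_2.10-1.0.jar"))
  .setMaster("spark://UK-RND-PN02.actixhost.eu:7077")
  .setAppName("Simple Application")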

The simple-project jar is indeed in the executor classpath list:
14/11/20 17:01:48 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Executor classpath is:/S:/spark/simple/
Executor classpath is:/S:/spark/simple/target/scala-2.10/simple-project_2.10-1.0.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/conf/
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar
Executor classpath is:/S:/spark/simple/
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.1.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-core-3.2.2.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.1.jar
Executor classpath is:/S:/spark/simple/
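One extra check I can think of (a sketch, run on the driver right after the classpath dump) is to confirm that the closure class from the stack trace resolves at all:

// Sketch: try to load the compiled closure class that the executor could not find.
// "SimpleApp$$anonfun$1" is the name taken from the executor's stack trace.
try {
  Class.forName("SimpleApp$$anonfun$1")
  println("Driver can load SimpleApp$$anonfun$1")
} catch {
  case e: ClassNotFoundException =>
    println("Driver cannot load it either: " + e)
}

If the driver loads it fine, the problem would seem confined to how the class reaches the executors.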

Would you have any idea how I could investigate this further?

Thanks!
Benoit.


PS: I could attach a debugger to the Worker where the ClassNotFoundException happens, but it is a bit painful.
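(For reference, the way I attach is roughly this sketch, assuming port 5005 is reachable on the worker box; suspend=y makes the executor JVM wait for the debugger before running tasks.)

// Sketch: pass JDWP options to the executor JVMs via the SparkConf.
// Assumes spark.executor.extraJavaOptions is honored by the standalone worker.
conf.set("spark.executor.extraJavaOptions",
  "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005")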
