Hi, everybody!

I'm trying to deploy a simple app on a Spark standalone cluster with a
single node (localhost).
Unfortunately, something goes wrong while processing the JAR file and a
NullPointerException is thrown.
I'm running everything on a single machine with Windows 8.
Details below.
Please help with suggestions on what is missing to make this work - I'm
really looking forward to working with Spark on a cluster.
The problem shows up both with my own little programs and with the Spark
examples (e.g. WordCount).
It also shows up whether I run my own driver directly or go through
spark-submit or run-example (which calls spark-submit).
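For reference, the spark-submit invocation is along these lines (the main
class name here is a placeholder for my actual class; run from the Spark
home directory):

    bin\spark-submit.cmd --class sparktests.SparkTests --master spark://mymachine:7077 file:///myworkspace/spark-tests.jar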


(I also compiled Hadoop from source for Windows - but it is not really
being used here.)


Driver code:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf().setAppName("SimpleTests")
            .setJars(new String[]{"file:///myworkspace/spark-tests.jar"})
            .setMaster("spark://mymachine:7077")
            .setSparkHome("/mysparkhome/spark-1.1.0-bin-hadoop2.4");
    JavaSparkContext sc = new JavaSparkContext(conf);

The job code itself is trivial and the usual read-filter-count.
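A rough sketch of it (the input path and the filter predicate are
placeholders, not my exact code; the commented line numbers are where
filter and count sit in SparkTests.java according to the log below):

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function;

    // Read a single input file (the log reports "Total input paths to process : 1").
    JavaRDD<String> lines = sc.textFile("file:///myworkspace/input.txt");

    // filter at SparkTests.java:42 - keep only the matching lines.
    JavaRDD<String> matching = lines.filter(new Function<String, Boolean>() {
        public Boolean call(String line) {
            return line.contains("error"); // placeholder predicate
        }
    });

    // count at SparkTests.java:48 - the action that triggers the failing job.
    System.out.println("count = " + matching.count());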

I get this output and error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/JorgePaulo/tmp/hadoop/hadoop-2.4.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/JorgePaulo/tmp/spark/spark-1.1.0-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/10/12 20:15:00 INFO SecurityManager: Changing view acls to: JorgePaulo,
14/10/12 20:15:00 INFO SecurityManager: Changing modify acls to: JorgePaulo,
14/10/12 20:15:00 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(JorgePaulo, ); users with modify permissions: Set(JorgePaulo, )
14/10/12 20:15:01 INFO Slf4jLogger: Slf4jLogger started
14/10/12 20:15:02 INFO Remoting: Starting remoting
14/10/12 20:15:02 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@jsimao71-acer:4279]
14/10/12 20:15:02 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@jsimao71-acer:4279]
14/10/12 20:15:02 INFO Utils: Successfully started service 'sparkDriver' on port 4279.
14/10/12 20:15:02 INFO SparkEnv: Registering MapOutputTracker
14/10/12 20:15:02 INFO SparkEnv: Registering BlockManagerMaster
14/10/12 20:15:02 INFO DiskBlockManager: Created local directory at C:\Users\JORGEP~1\AppData\Local\Temp\spark-local-20141012201502-723f
14/10/12 20:15:02 INFO Utils: Successfully started service 'Connection manager for block manager' on port 4282.
14/10/12 20:15:02 INFO ConnectionManager: Bound socket to port 4282 with id = ConnectionManagerId(jsimao71-acer,4282)
14/10/12 20:15:02 INFO MemoryStore: MemoryStore started with capacity 669.3 MB
14/10/12 20:15:02 INFO BlockManagerMaster: Trying to register BlockManager
14/10/12 20:15:02 INFO BlockManagerMasterActor: Registering block manager jsimao71-acer:4282 with 669.3 MB RAM
14/10/12 20:15:02 INFO BlockManagerMaster: Registered BlockManager
14/10/12 20:15:02 INFO HttpFileServer: HTTP File server directory is C:\Users\JORGEP~1\AppData\Local\Temp\spark-4771bfb8-e4f4-43d2-a437-6d55ee7c88b4
14/10/12 20:15:02 INFO HttpServer: Starting HTTP Server
14/10/12 20:15:03 INFO Utils: Successfully started service 'HTTP file server' on port 4283.
14/10/12 20:15:03 INFO Utils: Successfully started service 'SparkUI' on port 4040.
14/10/12 20:15:03 INFO SparkUI: Started SparkUI at http://jsimao71-acer:4040
14/10/12 20:15:10 INFO SparkContext: Added JAR file:///Users/JorgePaulo/workspace/spark-tests.jar at http://192.168.179.1:4283/jars/spark-tests.jar with timestamp 1413141310617
14/10/12 20:15:10 INFO AppClient$ClientActor: Connecting to master spark://jsimao71-acer:7077...
14/10/12 20:15:10 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
14/10/12 20:15:11 INFO MemoryStore: ensureFreeSpace(159118) called with curMem=0, maxMem=701843374
14/10/12 20:15:11 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 155.4 KB, free 669.2 MB)
14/10/12 20:15:11 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141012201511-0014
14/10/12 20:15:11 INFO AppClient$ClientActor: Executor added: app-20141012201511-0014/0 on worker-20141012171633-jsimao71-acer-1970 (jsimao71-acer:1970) with 4 cores
14/10/12 20:15:11 INFO SparkDeploySchedulerBackend: Granted executor ID app-20141012201511-0014/0 on hostPort jsimao71-acer:1970 with 4 cores, 512.0 MB RAM
14/10/12 20:15:11 INFO AppClient$ClientActor: Executor updated: app-20141012201511-0014/0 is now RUNNING
14/10/12 20:15:11 INFO MemoryStore: ensureFreeSpace(12633) called with curMem=159118, maxMem=701843374
14/10/12 20:15:11 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 12.3 KB, free 669.2 MB)
14/10/12 20:15:11 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on jsimao71-acer:4282 (size: 12.3 KB, free: 669.3 MB)
14/10/12 20:15:11 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
14/10/12 20:15:12 INFO FileInputFormat: Total input paths to process : 1
14/10/12 20:15:12 INFO SparkContext: Starting job: count at SparkTests.java:48
14/10/12 20:15:12 INFO DAGScheduler: Got job 0 (count at SparkTests.java:48) with 2 output partitions (allowLocal=false)
14/10/12 20:15:12 INFO DAGScheduler: Final stage: Stage 0(count at SparkTests.java:48)
14/10/12 20:15:12 INFO DAGScheduler: Parents of final stage: List()
14/10/12 20:15:12 INFO DAGScheduler: Missing parents: List()
14/10/12 20:15:12 INFO DAGScheduler: Submitting Stage 0 (FilteredRDD[2] at filter at SparkTests.java:42), which has no missing parents
14/10/12 20:15:12 INFO MemoryStore: ensureFreeSpace(2944) called with curMem=171751, maxMem=701843374
14/10/12 20:15:12 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.9 KB, free 669.2 MB)
14/10/12 20:15:12 INFO MemoryStore: ensureFreeSpace(1877) called with curMem=174695, maxMem=701843374
14/10/12 20:15:12 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1877.0 B, free 669.2 MB)
14/10/12 20:15:12 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on jsimao71-acer:4282 (size: 1877.0 B, free: 669.3 MB)
14/10/12 20:15:12 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
14/10/12 20:15:12 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (FilteredRDD[2] at filter at SparkTests.java:42)
14/10/12 20:15:12 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/10/12 20:15:25 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@jsimao71-acer:4316/user/Executor#-1003079982] with ID 0
14/10/12 20:15:25 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:25 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:26 INFO BlockManagerMasterActor: Registering block manager jsimao71-acer:4335 with 265.1 MB RAM
14/10/12 20:15:27 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, jsimao71-acer): java.lang.NullPointerException:
        java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        org.apache.hadoop.util.Shell.run(Shell.java:418)
        org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        org.apache.spark.util.Utils$.fetchFile(Utils.scala:448)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:325)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:323)
        scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:323)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        java.lang.Thread.run(Thread.java:745)
14/10/12 20:15:27 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 2, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:27 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor jsimao71-acer: java.lang.NullPointerException (null) [duplicate 1]
14/10/12 20:15:27 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 3, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:27 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 2) on executor jsimao71-acer: java.lang.NullPointerException (null) [duplicate 2]
14/10/12 20:15:27 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 4, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:27 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 3) on executor jsimao71-acer: java.lang.NullPointerException (null) [duplicate 3]
14/10/12 20:15:27 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 5, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:27 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 4) on executor jsimao71-acer: java.lang.NullPointerException (null) [duplicate 4]
14/10/12 20:15:27 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 6, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:27 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 5) on executor jsimao71-acer: java.lang.NullPointerException (null) [duplicate 5]
14/10/12 20:15:27 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 7, jsimao71-acer, PROCESS_LOCAL, 1303 bytes)
14/10/12 20:15:27 INFO TaskSetManager: Lost task 1.3 in stage 0.0 (TID 6) on executor jsimao71-acer: java.lang.NullPointerException (null) [duplicate 6]
14/10/12 20:15:27 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
14/10/12 20:15:27 INFO TaskSchedulerImpl: Cancelling stage 0
14/10/12 20:15:27 INFO TaskSchedulerImpl: Stage 0 was cancelled
14/10/12 20:15:27 INFO DAGScheduler: Failed to run count at SparkTests.java:48
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, jsimao71-acer): java.lang.NullPointerException:
        java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        org.apache.hadoop.util.Shell.run(Shell.java:418)
        org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        org.apache.spark.util.Utils$.fetchFile(Utils.scala:448)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:325)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$6.apply(Executor.scala:323)
        scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:323)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


Looking at the source code (the .scala and .java files in the trace), it
seems that it could be a targetDir that is set to null - but I'm not sure.
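If something like that is right, the top frame would make sense:
ProcessBuilder.start() throws a NullPointerException whenever an element
of its command list is null, so a null path handed down to Hadoop's Shell
would produce exactly this trace. A tiny sketch of that mechanism (just
my guess - the class name is made up, and the null stands in for whatever
string fails to resolve on Windows):

    import java.io.IOException;

    public class NpeGuess {
        public static void main(String[] args) throws IOException {
            // ProcessBuilder.start() throws NullPointerException if any
            // element of the command list is null - the same top frame
            // (ProcessBuilder.start) as in the executor trace above.
            // The null stands in for whatever path/command string
            // Hadoop's Shell failed to resolve (my assumption).
            new ProcessBuilder(null, "chmod", "644", "some-file").start();
        }
    }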

Please help....

Thanks a lot,
Jorge.
