Hi all, I'm building Spark from source against the 1.4 branch and submitting jobs to YARN, and I'm seeing very strange behavior. The first two or three times I submit the job, it runs fine, computes Pi, and exits. The next time I run it, it gets stuck in the "ACCEPTED" state.
I'm kicking off the job in "yarn-client" mode like this:

    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn-client \
      --num-executors 3 \
      --driver-memory 4g \
      --executor-memory 2g \
      --executor-cores 1 \
      --queue thequeue \
      examples/target/scala-2.10/spark-examples*.jar 10

Here's what the ResourceManager shows:

[image: Yarn ResourceManager UI]

In the YARN ResourceManager logs, all I'm seeing is this:

    2015-06-08 14:49:57,166 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1433789077942_0004_000001 to scheduler from user: root
    2015-06-08 14:49:57,166 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1433789077942_0004_000001 State change from SUBMITTED to SCHEDULED

There's nothing in the NodeManager logs (though it's up and running); the job isn't getting that far. It seems to me there's an issue somewhere in the Spark 1.4 / YARN integration. Plain Hadoop jobs run without any issues; I've run the following multiple times:

    yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.4.2.jar pi 16 100

For reference, I'm compiling the source from the 1.4 branch and running it on a single-node CDH 5.4 cluster (Hadoop 2.6) in distributed mode. I'm building with:

    mvn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Pyarn -Phive -Phive-thriftserver -DskipTests clean package

Any help appreciated. Thanks,
-Matt
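In case it helps with diagnosis: since the FairScheduler accepts the attempt but never schedules it, one thing worth checking between runs is whether earlier applications are still holding resources in "thequeue". A sketch of what I've been checking (just the stock YARN CLI; the kill command is commented out since the application ID is a placeholder, and the script is guarded so it only does anything on a node with the Hadoop tools installed):

```shell
#!/bin/sh
# Diagnostic sketch: see which applications the scheduler has accepted
# but not started, which is where my Spark jobs are getting stuck.
if command -v yarn >/dev/null 2>&1; then
    # List apps sitting in the ACCEPTED state (waiting on the scheduler).
    yarn application -list -appStates ACCEPTED

    # If a previous run is still holding queue resources, free them:
    #   yarn application -kill <application_id>
else
    # Not on a cluster node; nothing to inspect.
    echo "yarn CLI not found; run this on the cluster node" >&2
fi
```

If the queue's capacity (or a maxRunningApps limit in fair-scheduler.xml) is exhausted by leftover apps, new submissions would sit in ACCEPTED exactly like this, so that's one thing I've been trying to rule out.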
