1) Parameters like "--num-executors" should come before the jar. That is, you want something like:

    $SPARK_HOME/bin/spark-submit --num-executors 3 --driver-memory 6g --executor-memory 7g \
      --master yarn-cluster --class EDDApp target/scala-2.10/edd....jar \
      <outputPath>

That is, *your* application's arguments come after the jar, and Spark's parameters come *before* the jar. That's how spark-submit knows which are which (at least that is my understanding).

2) Double-check that in your code, when you create the SparkContext or the configuration object, you don't set the master to "local" there. (I don't recall the exact order of priority if the parameters disagree with the code.)

Good luck!
-Mike
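To illustrate point 2, here is a minimal sketch of what the driver code might look like (the class name and argument handling are just illustrative, since we haven't seen your actual EDDApp source):

    import org.apache.spark.{SparkConf, SparkContext}

    object EDDApp {
      def main(args: Array[String]): Unit = {
        // Note: no setMaster("local[*]") here -- leave the master unset
        // so the --master flag passed to spark-submit takes effect.
        val conf = new SparkConf().setAppName("EDDApp")
        val sc = new SparkContext(conf)

        // Your own arguments (the ones given after the jar), e.g. the output path.
        val outputPath = args(0)

        // ... your job logic here ...

        sc.stop()
      }
    }

If setMaster("local") is hardcoded in the code, the job can end up running on the local machine even when you intend to submit it to YARN.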
From: kundan kumar <iitr.kun...@gmail.com>
To: spark users <user@spark.apache.org>
Sent: Wednesday, February 4, 2015 7:41 AM
Subject: Spark Job running on localhost on yarn cluster

Hi,

I am trying to execute my code on a yarn cluster. The command which I am using is

    $SPARK_HOME/bin/spark-submit --class "EDDApp" target/scala-2.10/edd-application_2.10-1.0.jar \
      --master yarn-cluster --num-executors 3 --driver-memory 6g --executor-memory 7g <outputPath>

But I can see that this program is running only on the localhost. It's able to read the file from HDFS. I have tried this in standalone mode and it works fine.

Please suggest where it is going wrong.

Regards,
Kundan