1) Parameters like "--num-executors" should come before the jar. That is, you
want something like

$SPARK_HOME/bin/spark-submit --num-executors 3 --driver-memory 6g \
  --executor-memory 7g --master yarn-cluster --class EDDApp \
  target/scala-2.10/edd....jar <outputPath>

That is, *your* parameters come after the jar, and spark's parameters come
*before* the jar. That's how spark knows which are which (at least that is my
understanding).
2) Double-check that in your code, when you create the SparkContext or the
configuration object, you don't set the master to "local" there. (I don't
recall the exact order of priority if the parameters disagree with the code.)
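For example, here is a minimal sketch of what EDDApp's entry point might look
like (the app name and the output-path argument are assumptions on my part,
since the original code isn't shown):

    import org.apache.spark.{SparkConf, SparkContext}

    object EDDApp {
      def main(args: Array[String]): Unit = {
        // Don't call setMaster("local") here -- leave the master to
        // spark-submit's --master flag so the job can run on YARN.
        val conf = new SparkConf().setAppName("EDDApp")
        val sc = new SparkContext(conf)
        // ... job logic, writing results to args(0) ...
        sc.stop()
      }
    }

For what it's worth, I believe properties set directly on the SparkConf take
precedence over spark-submit flags, so a hard-coded setMaster("local") would
explain the job running only on localhost.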
Good luck!
-Mike
From: kundan kumar <[email protected]>
To: spark users <[email protected]>
Sent: Wednesday, February 4, 2015 7:41 AM
Subject: Spark Job running on localhost on yarn cluster
Hi,
I am trying to execute my code on a yarn cluster
The command which I am using is
$SPARK_HOME/bin/spark-submit --class "EDDApp"
target/scala-2.10/edd-application_2.10-1.0.jar --master yarn-cluster
--num-executors 3 --driver-memory 6g --executor-memory 7g <outputPath>
But I can see that this program is running only on localhost.
It's able to read the file from HDFS.
I have tried this in standalone mode and it works fine.
Please suggest where it is going wrong.
Regards,
Kundan