Going back to your code, I see that you instantiate the spark context as:
  val sc = new SparkContext(args(0), "Csv loading example")
which sets the master URL to args(0) and the app name to "Csv loading
example". In your case, args(0) is
"hdfs://quickstart.cloudera:8020/people_csv", which is a file path rather
than a master URL, and that is why you are getting the error.

There are two ways to fix this:
1. Add the master URL to the command-line args, so that args(0) is the
master URL and args(1) is the CSV path:
  spark-submit --master yarn --class org.spark.apache.CsvDataSource \
    /home/cloudera/Desktop/TestMain.jar yarn \
    hdfs://quickstart.cloudera:8020/people_csv

2. Use the no-arg SparkContext constructor.
I would recommend this, since you are using spark-submit, which can set the
master URL and app name properties for you. Change your code to
"val sc = new SparkContext()" and pass the app name with the "--name" option
of spark-submit. You would also have to change your code to read the CSV
file path from "args(0)" (there is now only one command-line argument, and
indexing starts at 0):
  spark-submit --master yarn --name "Csv loading example" \
    --class org.spark.apache.CsvDataSource \
    /home/cloudera/Desktop/TestMain.jar \
    hdfs://quickstart.cloudera:8020/people_csv
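
For reference, here is a rough sketch of what the main class might look like
with the second approach. I have not seen the rest of your code, so the object
layout, the usage check, and the textFile/count lines are only placeholders I
made up for illustration; the package and class name are taken from your
--class argument. The parts that matter are the no-arg constructor and reading
the path from args(0).

```scala
package org.spark.apache

import org.apache.spark.SparkContext

object CsvDataSource {
  def main(args: Array[String]): Unit = {
    // Hypothetical usage check: the only expected argument is the CSV path.
    require(args.length >= 1, "usage: CsvDataSource <csv-path>")
    val csvPath = args(0)

    // No-arg constructor: the master URL and app name are taken from the
    // properties that spark-submit sets via --master and --name.
    val sc = new SparkContext()

    // Placeholder for your actual CSV loading logic (assumed, not from
    // your code): read the file as plain text and count the lines.
    val lines = sc.textFile(csvPath)
    println(s"Read ${lines.count()} lines from $csvPath")

    sc.stop()
  }
}
```

With the first approach, the constructor call would stay as it is and the
path would be read from args(1) instead.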

Lastly, according to this documentation:
http://spark.apache.org/docs/latest/submitting-applications.html#master-urls,
"yarn" is not a valid master URL. It looks like you need to use
"yarn-client" or "yarn-cluster". Unfortunately, I do not have experience
using YARN, so I can't help you there. Here is more documentation on YARN
that you can read: http://spark.apache.org/docs/latest/running-on-yarn.html.

-Nick


