Going back to your code, I see that you instantiate the SparkContext as:

    val sc = new SparkContext(args(0), "Csv loading example")

This sets the master url to the value of args(0) and the app name to "Csv loading example". In your case, args(0) is "hdfs://quickstart.cloudera:8020/people_csv", which is a file path and not a master url, so that is why you are getting the error.
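To make the mix-up concrete, here is a minimal sketch, assuming spark-core is on the classpath. It only builds a SparkConf and does not start a context or contact a cluster. The object name MasterUrlCheck and the helper confFor are made up for illustration; the helper sets the same two properties that the two-argument constructor call above sets from its arguments.

```scala
import org.apache.spark.SparkConf

object MasterUrlCheck {
  // Hypothetical helper (not part of Spark): records master and app name
  // on a SparkConf, without starting a SparkContext.
  def confFor(master: String, appName: String): SparkConf =
    new SparkConf(false) // false = do not load spark.* system properties
      .setMaster(master)
      .setAppName(appName)

  def main(args: Array[String]): Unit = {
    // Same value the original spark-submit command passed as args(0)
    val firstArg = "hdfs://quickstart.cloudera:8020/people_csv"
    val conf = confFor(firstArg, "Csv loading example")

    // The HDFS path ends up where Spark expects a master url
    println("spark.master   = " + conf.get("spark.master"))
    println("spark.app.name = " + conf.get("spark.app.name"))
  }
}
```

Running it shows the HDFS path stored under spark.master, which is the value Spark then fails to interpret as a master url.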
There are two ways to fix this:

1. Add the master url to the command line args:

    spark-submit --master yarn --class org.spark.apache.CsvDataSource /home/cloudera/Desktop/TestMain.jar yarn hdfs://quickstart.cloudera:8020/people_csv

2. Use the no-arg SparkContext constructor. I would recommend this since you are using spark-submit, which can set the master url and app name properties. You would have to change your code to "val sc = new SparkContext()" and use the "--name" option for spark-submit. You would also have to change your code to read the csv file path from "args(0)" (since there is then only one command line argument, and arguments are indexed from 0):

    spark-submit --master yarn --name "Csv loading example" --class org.spark.apache.CsvDataSource /home/cloudera/Desktop/TestMain.jar hdfs://quickstart.cloudera:8020/people_csv

Lastly, if you look at this documentation: http://spark.apache.org/docs/latest/submitting-applications.html#master-urls, "yarn" is not a valid master url. It looks like you need to use "yarn-client" or "yarn-cluster", so you may need to replace "yarn" in the commands above with one of those. Unfortunately, I do not have experience using yarn, so I can't help you there. Here is more documentation for yarn you can read: http://spark.apache.org/docs/latest/running-on-yarn.html

-Nick