Hi Andrew,

Thanks for the response. I believe I have HDFS set up correctly: all my slaves can access it fine, and I can list the files I'm storing there with ~/persistent-hdfs/bin/hadoop fs -ls, etc. However, when I run spark-submit --master local, I still get the following error:
14/07/18 16:20:58 ERROR executor.Executor: Exception in task ID 0
java.io.IOException: No such file or directory

Thanks for any help,
Chris

On Thu, Jul 17, 2014 at 6:42 PM, Andrew Or <[email protected]> wrote:

> Hi Chris,
>
> Did you ever figure this out? It should just work provided that your HDFS
> is set up correctly. If you don't call setMaster, it uses
> spark://[master-node-ip]:7077 by default (this is configured in your
> conf/spark-env.sh). However, even if you use a local master, it should
> still work (I just tried this on my own EC2 cluster). By the way,
> SPARK_MASTER is deprecated; please use bin/spark-submit --master [your
> master] instead.
>
> Andrew
>
>
> 2014-07-16 23:46 GMT-07:00 Akhil Das <[email protected]>:
>
>> You can try the following in the spark-shell:
>>
>> 1. Run it in *Cluster mode* by going inside the Spark directory:
>>
>> $ SPARK_MASTER=spark://masterip:7077 ./bin/spark-shell
>>
>> val textFile = sc.textFile("hdfs://masterip/data/blah.csv")
>> textFile.take(10).foreach(println)
>>
>> 2. Now try running in *Local mode:*
>>
>> $ SPARK_MASTER=local ./bin/spark-shell
>>
>> val textFile = sc.textFile("hdfs://masterip/data/blah.csv")
>> textFile.take(10).foreach(println)
>>
>> Both should print the first 10 lines from your blah.csv file.
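For completeness, here is the same read test as a small standalone app that can be run through spark-submit directly rather than the shell. This is only a minimal sketch against the Spark 1.0-era Scala API; the object name HdfsReadTest, the HDFS host, the NameNode port (9000 is a common default, but check fs.default.name in your conf/core-site.xml), and the file path are all placeholders for your cluster's values:

    import org.apache.spark.{SparkConf, SparkContext}

    object HdfsReadTest {
      def main(args: Array[String]): Unit = {
        // Leave the master out of the conf so it can be chosen per run
        // with --master (local, or spark://masterip:7077).
        val conf = new SparkConf().setAppName("HdfsReadTest")
        val sc = new SparkContext(conf)

        // Fully qualified HDFS URI; host and port here are placeholders
        // for your own NameNode.
        val textFile = sc.textFile("hdfs://masterip:9000/data/blah.csv")
        textFile.take(10).foreach(println)

        sc.stop()
      }
    }

Packaged into a jar, it should behave the same under both masters:

    $ ./bin/spark-submit --class HdfsReadTest --master local /path/to/hdfs-read-test.jar
    $ ./bin/spark-submit --class HdfsReadTest --master spark://masterip:7077 /path/to/hdfs-read-test.jar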
