Hi Chris,

Did you ever figure this out? It should just work provided that your HDFS
is set up correctly. If you don't call setMaster, it uses
spark://[master-node-ip]:7077 by default (this is configured in your
conf/spark-env.sh). However, even if you use a local master, it should
still work (I just tried this on my own EC2 cluster). By the way, the
SPARK_MASTER environment variable is deprecated. Please use bin/spark-submit
--master [your master] instead.
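For reference, here is a minimal sketch of the non-deprecated invocation. The hostname, class name, and jar name below are placeholders, not values from this thread:

```shell
# Launch spark-shell against a standalone master (placeholder host).
./bin/spark-shell --master spark://masterip:7077

# The same flag works for packaged applications via spark-submit;
# --master replaces the deprecated SPARK_MASTER environment variable.
# (com.example.MyApp and myapp.jar are hypothetical.)
./bin/spark-submit --master spark://masterip:7077 \
  --class com.example.MyApp myapp.jar

# Local-mode equivalent:
./bin/spark-shell --master "local[*]"
```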

Andrew


2014-07-16 23:46 GMT-07:00 Akhil Das <ak...@sigmoidanalytics.com>:

> You can try the following in the spark-shell:
>
> 1. Run it in *cluster mode* from inside the Spark directory:
>
> $ SPARK_MASTER=spark://masterip:7077 ./bin/spark-shell
>
> val textFile = sc.textFile("hdfs://masterip/data/blah.csv")
>
> textFile.take(10).foreach(println)
>
>
> 2. Now try running in *local mode*:
>
> $ SPARK_MASTER=local ./bin/spark-shell
>
> val textFile = sc.textFile("hdfs://masterip/data/blah.csv")
>
> textFile.take(10).foreach(println)
>
>
> Both should print the first 10 lines from your blah.csv file.
