Yes, I did this recently. You need to copy the Cloudera cluster's Hadoop conf files to the local machine and set HADOOP_CONF_DIR or YARN_CONF_DIR.
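A minimal sketch of that setup (the host name and directory paths below are made up for illustration; on CDH the client configs typically live under /etc/hadoop/conf on a cluster node):

```shell
# Pull the cluster's client configs to the local machine (hypothetical host/path),
# then point Spark's Hadoop client at them before launching the shell or a job.
#   scp -r user@cluster-node:/etc/hadoop/conf ~/cloudera-conf
export HADOOP_CONF_DIR=~/cloudera-conf   # core-site.xml, hdfs-site.xml live here
export YARN_CONF_DIR=~/cloudera-conf     # also needed when submitting to YARN
```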
Your local machine should also be able to SSH to the Cloudera cluster.

On Wed, Jul 15, 2015 at 8:51 AM, ayan guha <guha.a...@gmail.com> wrote:

> Assuming you run Spark locally (i.e. either local mode or a standalone
> cluster on your local m/c):
>
> 1. You need to have the Hadoop binaries locally.
> 2. You need to have hdfs-site.xml on the Spark classpath of your local m/c.
>
> I would suggest you start off with local files to play around.
>
> If you need to run Spark on the CDH cluster using YARN, then you need to
> use spark-submit against the YARN cluster. You can see a very good example
> here: https://spark.apache.org/docs/latest/running-on-yarn.html
>
> On Wed, Jul 15, 2015 at 10:36 PM, Jeskanen, Elina <elina.jeska...@cgi.com>
> wrote:
>
>> I have Spark 1.4 on my local machine and I would like to connect to our
>> local 4-node Cloudera cluster. But how?
>>
>> In the example it says text_file = spark.textFile("hdfs://..."), but can
>> you advise me on where to get this "hdfs://..." address?
>>
>> Thanks!
>>
>> Elina
>
> --
> Best Regards,
> Ayan Guha
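Regarding the "hdfs://..." address asked about in the quoted thread: it is the fs.defaultFS value in the cluster's core-site.xml. A self-contained sketch of the lookup (the namenode hostname below is invented for demonstration; on a real setup you would grep the core-site.xml copied from the cluster):

```shell
# Create a sample core-site.xml purely to demonstrate where the address lives.
mkdir -p /tmp/demo-conf
cat > /tmp/demo-conf/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
EOF
# Extract the hdfs:// URI Spark would use,
# e.g. spark.textFile("hdfs://namenode.example.com:8020/path/to/file")
grep -o 'hdfs://[^<]*' /tmp/demo-conf/core-site.xml
```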