Yes, I did this recently. You need to copy the Cloudera cluster's Hadoop conf files to the local machine and set HADOOP_CONF_DIR or YARN_CONF_DIR.
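A minimal sketch of that setup (the host name and directory paths below are made up for illustration; on CDH the client configs typically live under /etc/hadoop/conf on a cluster node):

```shell
# Pull the cluster's client configs to the local machine (hypothetical host/path),
# then point Spark's Hadoop client at them before launching the shell or a job.
#   scp -r user@cluster-node:/etc/hadoop/conf ~/cloudera-conf
export HADOOP_CONF_DIR=~/cloudera-conf   # core-site.xml, hdfs-site.xml live here
export YARN_CONF_DIR=~/cloudera-conf     # also needed when submitting to YARN
```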
Your local machine should also be able to SSH to the Cloudera cluster.

On Wed, Jul 15, 2015 at 8:51 AM, ayan guha <guha.a...@gmail.com> wrote:

> Assuming you run Spark locally (i.e. either local mode or a standalone
> cluster on your local m/c):
>
> 1. You need to have the Hadoop binaries locally.
> 2. You need to have hdfs-site.xml on the Spark classpath of your local m/c.
>
> I would suggest you start off with local files to play around.
>
> If you need to run Spark on the CDH cluster using YARN, then you need to
> use spark-submit against the YARN cluster. You can see a very good example
> here: https://spark.apache.org/docs/latest/running-on-yarn.html
>
> On Wed, Jul 15, 2015 at 10:36 PM, Jeskanen, Elina <elina.jeska...@cgi.com>
> wrote:
>
>> I have Spark 1.4 on my local machine and I would like to connect to our
>> local 4-node Cloudera cluster. But how?
>>
>> In the example it says text_file = spark.textFile("hdfs://..."), but can
>> you advise me on where to get this "hdfs://..." address?
>>
>> Thanks!
>>
>> Elina
>
> --
> Best Regards,
> Ayan Guha
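Regarding the "hdfs://..." address asked about in the quoted thread: it is the fs.defaultFS value in the cluster's core-site.xml. A self-contained sketch of the lookup (the namenode hostname below is invented for demonstration; on a real setup you would grep the core-site.xml copied from the cluster):

```shell
# Create a sample core-site.xml purely to demonstrate where the address lives.
mkdir -p /tmp/demo-conf
cat > /tmp/demo-conf/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
EOF
# Extract the hdfs:// URI Spark would use,
# e.g. spark.textFile("hdfs://namenode.example.com:8020/path/to/file")
grep -o 'hdfs://[^<]*' /tmp/demo-conf/core-site.xml
```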