On Sun, Jan 15, 2017 at 11:09 AM, Andrew Holway <andrew.hol...@otternetworks.de> wrote:
> use yarn :)
>
> "spark-submit --master yarn"

Doesn't this require first copying out various Hadoop configuration XML
files from the EMR master node to the machine running spark-submit? Or is
there a well-known minimal set of host/port options to avoid that? I am
currently copying several XML files out and using them on a client running
spark-submit, but I feel uneasy about this, as the local values seem to
override the values on the cluster at runtime -- they are copied up with
the job.

> On Sun, Jan 15, 2017 at 7:55 PM, Darren Govoni <dar...@ontrenet.com> wrote:
>
>> So what was the answer?
>>
>> Sent from my Verizon, Samsung Galaxy smartphone
>>
>> -------- Original message --------
>> From: Andrew Holway <andrew.hol...@otternetworks.de>
>> Date: 1/15/17 11:37 AM (GMT-05:00)
>> To: Marco Mistroni <mmistr...@gmail.com>
>> Cc: Neil Jonkers <neilod...@gmail.com>, User <user@spark.apache.org>
>> Subject: Re: Running Spark on EMR
>>
>> Darn. I didn't respond to the list. Sorry.
>>
>> On Sun, Jan 15, 2017 at 5:29 PM, Marco Mistroni <mmistr...@gmail.com> wrote:
>>
>>> Thanks, Neil. I followed the original suggestion from Andrew and
>>> everything is working fine now.
>>> kr
>>>
>>> On Sun, Jan 15, 2017 at 4:27 PM, Neil Jonkers <neilod...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> Can you drop the URL:
>>>>
>>>> spark://master:7077
>>>>
>>>> That URL is only used when running Spark in standalone mode.
>>>>
>>>> Regards
>>>>
>>>> -------- Original message --------
>>>> From: Marco Mistroni
>>>> Date: 15/01/2017 16:34 (GMT+02:00)
>>>> To: User
>>>> Subject: Running Spark on EMR
>>>>
>>>> Hi all,
>>>> Could anyone assist here?
>>>> I am trying to run Spark 2.0.0 on an EMR cluster, but I am having
>>>> issues connecting to the master node. Below is a snippet of what I am
>>>> doing:
>>>>
>>>> sc = SparkSession.builder.master(sparkHost).appName("DataProcess").getOrCreate()
>>>>
>>>> sparkHost is passed as an input parameter. The idea is that I can run
>>>> the script locally on my local Spark instance as well as submit it to
>>>> any cluster I want.
>>>>
>>>> So far I have:
>>>> 1 - set up a cluster on EMR
>>>> 2 - connected to the master node
>>>> 3 - launched the command: spark-submit myscripts.py spark://master:7077
>>>>
>>>> But that results in a connection refused exception. I then tried
>>>> removing the .master call above and launching the script with the
>>>> following command:
>>>>
>>>> spark-submit --master spark://master:7077 myscript.py
>>>>
>>>> but I am still getting a connection refused exception.
>>>>
>>>> I am using Spark 2.0.0. Could anyone advise on how I should build the
>>>> Spark session, and how I can submit a Python script to the cluster?
>>>>
>>>> kr
>>>> marco
>>>
>>
>> --
>> Otter Networks UG
>> http://otternetworks.de
>> Gotenstraße 17
>> 10829 Berlin

--
Otter Networks UG
http://otternetworks.de
Gotenstraße 17
10829 Berlin
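The pattern the thread converges on can be sketched as follows: only call
.master() when a master URL is explicitly passed on the command line, and
otherwise leave it unset so that whatever spark-submit was given via
--master (e.g. yarn on EMR) takes effect. This is a minimal illustrative
sketch, not Marco's actual script; the helper names resolve_master and
build_session are made up for this example, and the SparkSession import is
done lazily so the resolution logic can be read (and run) without a Spark
installation.

```python
import sys


def resolve_master(argv, default=None):
    """Return the master URL from the command line, or `default` if absent.

    Returning None means "do not call .master() at all", which lets the
    value supplied via `spark-submit --master ...` win.
    """
    return argv[1] if len(argv) > 1 else default


def build_session(master, app_name="DataProcess"):
    # Lazy import: this sketch stays importable without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    if master:  # only pin the master when one was explicitly requested
        builder = builder.master(master)
    return builder.getOrCreate()


# Hypothetical usage (not executed here):
#   Local run:          python myscript.py local[*]
#   On the EMR master:  spark-submit --master yarn myscript.py
#
# spark = build_session(resolve_master(sys.argv))
```

With this shape, the spark://master:7077 URL is never needed on EMR: that
URL is only meaningful for a standalone Spark cluster, while EMR runs Spark
on YARN, so `--master yarn` (issued on the master node, where the Hadoop
configuration is already present) is the natural submission path.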