Re: Running Spark on EMR
On Sun, Jan 15, 2017 at 11:09 AM, Andrew Holway <andrew.hol...@otternetworks.de> wrote:

> use yarn :)
>
> "spark-submit --master yarn"

Doesn't this require first copying various Hadoop configuration XML files from the EMR master node to the machine running spark-submit? Or is there a well-known minimal set of host/port options that avoids that?

I'm currently copying several XML files to the client that runs spark-submit, but I feel uneasy about this, as it seems the local values override the cluster's values at runtime: they're copied up with the job.
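On the question above: Spark's "Running on YARN" documentation says that HADOOP_CONF_DIR or YARN_CONF_DIR must point to a directory containing the client-side Hadoop configuration files, so some copy of them is needed on the submitting machine. Below is a minimal sketch of a pre-flight check that could be run before spark-submit. The helper name `missing_conf_files` and the list of required files are assumptions for illustration and do not come from this thread.

```python
import os

# Assumption: these two files are treated as the minimum worth checking for;
# a real cluster may need others (for example hdfs-site.xml).
REQUIRED_FILES = ("core-site.xml", "yarn-site.xml")


def missing_conf_files(conf_dir, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in conf_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(conf_dir, name))]


if __name__ == "__main__":
    # spark-submit uses HADOOP_CONF_DIR (or YARN_CONF_DIR) to locate the cluster.
    conf_dir = os.environ.get("HADOOP_CONF_DIR") or os.environ.get("YARN_CONF_DIR")
    if not conf_dir:
        print("Neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set")
    else:
        missing = missing_conf_files(conf_dir)
        print("missing files:", missing if missing else "none")
```

This only confirms that the files exist on the client. It does not address whether the local values take precedence over the cluster's at runtime.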
Re: Running Spark on EMR
use yarn :)

"spark-submit --master yarn"

--
Otter Networks UG
http://otternetworks.de
Gotenstraße 17
10829 Berlin
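As a sketch of the suggestion above, the same submission can be driven from Python. This assumes it runs on a machine where `spark-submit` is on the PATH and the YARN client configuration is already in place (for example the EMR master node). The helper names `yarn_submit_command` and `submit` are hypothetical; `--master` and `--deploy-mode` are standard spark-submit options.

```python
import subprocess


def yarn_submit_command(script, deploy_mode="client"):
    """Build the argument list for submitting a Python script to YARN."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return ["spark-submit", "--master", "yarn",
            "--deploy-mode", deploy_mode, script]


def submit(script):
    """Run spark-submit and wait for it to finish."""
    # check_call raises CalledProcessError if spark-submit exits non-zero.
    subprocess.check_call(yarn_submit_command(script))
```

Calling `submit("myscript.py")` is then equivalent to typing `spark-submit --master yarn --deploy-mode client myscript.py` in a shell.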
Re: Running Spark on EMR
So what was the answer?
Re: Running Spark on EMR
Darn. I didn't respond to the list. Sorry.
Re: Running Spark on EMR
Thanks, Neil. I followed the original suggestion from Andrew and everything is working fine now.

kr
Re: Running Spark on EMR
Hello,

Can you drop the url:

spark://master:7077

The url is used when running Spark in standalone mode.

Regards
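To make the point above concrete: the form of the master URL selects the cluster manager, and `spark://HOST:PORT` addresses a standalone master (7077 is its default port). EMR runs Spark on YARN, so a likely explanation for the refused connection is that nothing is listening on that port. The sketch below maps the master URL forms listed in the Spark 2.0 documentation to the manager they select. The function name `cluster_manager` is hypothetical and is not part of Spark.

```python
def cluster_manager(master):
    """Map a Spark master URL to the kind of cluster manager it selects."""
    if master == "yarn":
        return "yarn"
    if master == "local" or master.startswith("local["):
        return "local"
    if master.startswith("spark://"):
        return "standalone"
    if master.startswith("mesos://"):
        return "mesos"
    raise ValueError("unrecognised master URL: %r" % master)


if __name__ == "__main__":
    for url in ("spark://master:7077", "yarn", "local[*]"):
        print(url, "->", cluster_manager(url))
```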
Running Spark on EMR
Hi all,

Could anyone assist here? I am trying to run Spark 2.0.0 on an EMR cluster, but I am having issues connecting to the master node. Below is a snippet of what I am doing:

    sc = SparkSession.builder.master(sparkHost).appName("DataProcess").getOrCreate()

sparkHost is passed as an input parameter. The idea was that I could run the script on my local Spark instance as well as submit it to any cluster I want.

So far I have:

1 - set up a cluster on EMR
2 - connected to the master node
3 - launched the command

    spark-submit myscripts.py spark://master:7077

But that results in a connection refused exception. I then tried removing the .master call above and launching the script with the following command:

    spark-submit --master spark://master:7077 myscript.py

but I am still getting a connection refused exception.

I am using Spark 2.0.0. Could anyone advise on how I should build the Spark session and how I can submit a Python script to the cluster?

kr
marco
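One way to keep a single script usable both locally and on a cluster is to set the master in code only when it is passed explicitly, and otherwise leave it to spark-submit. The sketch below assumes the master is an optional first script argument, as in the post. The names `parse_master`, `build_session` and `main` are hypothetical; `SparkSession.builder`, `.master()`, `.appName()` and `.getOrCreate()` are the PySpark calls already used above.

```python
import argparse


def parse_master(argv):
    """Return the optional master URL given as the first script argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument("master", nargs="?", default=None)
    return parser.parse_args(argv).master


def build_session(master=None, app_name="DataProcess"):
    """Create a SparkSession, overriding the master only when one is given."""
    # Imported here so the argument parsing works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    if master:
        builder = builder.master(master)
    # With no explicit master, the value supplied by spark-submit is used.
    return builder.getOrCreate()


def main(argv):
    spark = build_session(parse_master(argv))
    print(spark.sparkContext.master)
    spark.stop()
```

With `main(sys.argv[1:])` called at the bottom of the script, `spark-submit --master yarn myscript.py` would run it on the cluster and `spark-submit myscript.py local[2]` would run it locally, with no change to the code.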