Re: Running Spark on EMR

2017-01-16 Thread Everett Anderson
On Sun, Jan 15, 2017 at 11:09 AM, Andrew Holway <
andrew.hol...@otternetworks.de> wrote:

> use yarn :)
>
> "spark-submit --master yarn"
>

Doesn't this require first copying out various Hadoop configuration XML
files from the EMR master node to the machine running the spark-submit? Or
is there a well-known minimal set of host/port options to avoid that?

I'm currently copying out several XML files and using them on a client
running spark-submit, but I feel uneasy about this as it seems like the
local values override values on the cluster at runtime -- they're copied up
with the job.
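
For context on the question above: per the Spark on YARN documentation, the submitting machine locates the cluster through the client-side Hadoop configuration directory named by HADOOP_CONF_DIR or YARN_CONF_DIR. Below is a minimal sketch of a pre-flight check for such a client. The helper names and the list of required files are assumptions made for illustration; the thread does not confirm which files are the minimal set.

```python
import os

# Assumption: the files a client needs. The exact set depends on the cluster
# and is not confirmed anywhere in this thread.
ASSUMED_REQUIRED_FILES = ("core-site.xml", "yarn-site.xml")


def find_hadoop_conf_dir(env=None):
    """Return the directory named by HADOOP_CONF_DIR or YARN_CONF_DIR, or None."""
    env = os.environ if env is None else env
    for var in ("HADOOP_CONF_DIR", "YARN_CONF_DIR"):
        path = env.get(var)
        if path and os.path.isdir(path):
            return path
    return None


def missing_conf_files(conf_dir, required=ASSUMED_REQUIRED_FILES):
    """List the assumed-required files that are absent from conf_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(conf_dir, name))]
```

Running both helpers before spark-submit makes it obvious which local copies are in play, which may help when checking whether local values differ from those on the cluster.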






Re: Running Spark on EMR

2017-01-15 Thread Andrew Holway
use yarn :)

"spark-submit --master yarn"

On Sun, Jan 15, 2017 at 7:55 PM, Darren Govoni <dar...@ontrenet.com> wrote:

> So what was the answer?



-- 
Otter Networks UG
http://otternetworks.de
Gotenstraße 17
10829 Berlin


Re: Running Spark on EMR

2017-01-15 Thread Darren Govoni
So what was the answer?


Sent from my Verizon, Samsung Galaxy smartphone
Original message
From: Andrew Holway <andrew.hol...@otternetworks.de>
Date: 1/15/17 11:37 AM (GMT-05:00)
To: Marco Mistroni <mmistr...@gmail.com>
Cc: Neil Jonkers <neilod...@gmail.com>, User <user@spark.apache.org>
Subject: Re: Running Spark on EMR

Darn. I didn't respond to the list. Sorry.





Re: Running Spark on EMR

2017-01-15 Thread Andrew Holway
Darn. I didn't respond to the list. Sorry.



On Sun, Jan 15, 2017 at 5:29 PM, Marco Mistroni  wrote:

> Thanks Neil. I followed the original suggestion from Andrew and everything is
> working fine now.
> kr


-- 
Otter Networks UG
http://otternetworks.de
Gotenstraße 17
10829 Berlin


Re: Running Spark on EMR

2017-01-15 Thread Marco Mistroni
Thanks Neil. I followed the original suggestion from Andrew and everything is
working fine now.
kr
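
The suggestion that worked here was "spark-submit --master yarn". spark-submit expects its own options before the script path, and anything after the script is passed to the script as an application argument. A minimal sketch of assembling that command line follows; the helper name spark_submit_command and its parameters are hypothetical, used only for illustration.

```python
import shlex


def spark_submit_command(script, master="yarn", deploy_mode=None, app_args=()):
    """Build the spark-submit argument list with options ahead of the script."""
    cmd = ["spark-submit", "--master", master]
    if deploy_mode is not None:
        cmd += ["--deploy-mode", deploy_mode]
    cmd.append(script)
    # Everything after the script path is handed to the script itself.
    cmd.extend(app_args)
    return cmd


# Shell-quote each part so values such as local[*] survive copy and paste.
print(" ".join(shlex.quote(part) for part in spark_submit_command("myscript.py")))
```

The returned list can also be passed directly to subprocess.run, which avoids shell quoting altogether.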

On Sun, Jan 15, 2017 at 4:27 PM, Neil Jonkers  wrote:

> Hello,
>
> Can you drop the url:
>
>  spark://master:7077
>
> The url is used when running Spark in standalone mode.
>
> Regards


Re: Running Spark on EMR

2017-01-15 Thread Neil Jonkers
Hello,

Can you drop the url:

 spark://master:7077

The url is used when running Spark in standalone mode.

Regards
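
To illustrate the point about URL schemes: the form of the master URL selects the cluster manager, and EMR runs Spark on YARN, so presumably no standalone master is listening on port 7077 there. The sketch below maps the master URL forms listed in the Spark 2.x documentation to a cluster manager name. The function name cluster_manager is hypothetical, and the mapping deliberately ignores older or newer URL forms.

```python
def cluster_manager(master):
    """Name the cluster manager implied by a Spark master URL."""
    if master == "local" or master.startswith("local["):
        return "local"
    if master.startswith("spark://"):
        # Standalone mode: needs a running standalone master at host:port.
        return "standalone"
    if master.startswith("mesos://"):
        return "mesos"
    if master == "yarn":
        # The cluster location comes from the Hadoop configuration, not the URL.
        return "yarn"
    raise ValueError("unrecognised master URL: %r" % master)
```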

Original message
From: Marco Mistroni
Date: 15/01/2017 16:34 (GMT+02:00)
To: User
Subject: Running Spark on EMR

Hi all,
could anyone assist here?
I am trying to run Spark 2.0.0 on an EMR cluster, but I am having issues
connecting to the master node.
Below is a snippet of what I am doing:


sc = SparkSession.builder.master(sparkHost).appName("DataProcess").getOrCreate()

sparkHost is passed as an input parameter. The idea was that I could run the
script locally on my local Spark instance as well as submit it to any cluster
I want.


Now I have
1 - set up a cluster on EMR
2 - connected to the master node
3 - launched the command spark-submit myscripts.py spark://master:7077

But that results in a connection refused exception.
Then I tried removing the .master call above and launching the script with
the following command:

spark-submit --master spark://master:7077   myscript.py

but I am still getting a connection refused exception.


I am using Spark 2.0.0. Could anyone advise on how I should build the Spark
session and how I can submit a Python script to the cluster?

kr
 marco
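
On the question of how to build the session: one way to keep the script portable between a local instance and a cluster is to leave the master out of the code unless one is explicitly given. A minimal sketch under that assumption follows; the function name build_session and its parameters are hypothetical, while SparkSession.builder, appName, master and getOrCreate are the PySpark calls already used in the snippet above.

```python
def build_session(app_name="DataProcess", master=None):
    """Create a SparkSession, setting the master only when one is given."""
    # Imported here so the module can be loaded on machines without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    if master is not None:
        # An explicit master set in code overrides spark-submit's --master.
        builder = builder.master(master)
    return builder.getOrCreate()
```

With no master set in the code, the --master value given to spark-submit decides where the job runs, so the same script can be submitted locally or to the cluster unchanged.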