Re: How best to install Spark?

2015-02-02 Thread Samuel Cozannet
Indeed...

I actually use that in other cases but for some reason I didn't get it
right this time :/. Thanks for catching this!

Best,
Sam

Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu / Canonical UK LTD / Juju
samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Tue, Feb 3, 2015 at 1:43 AM, Andrew Wilkins  wrote:

> On Mon, Feb 2, 2015 at 9:57 PM, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Excellent! Happy to help you through your discovery of awesomeness with
>> Juju :)
>>
>> Note that, if you have jq installed (which I advise: sudo apt-get install
>> jq),
>> juju status | python -c 'import sys, yaml, json;
>> json.dump(yaml.load(sys.stdin), sys.stdout, indent=4)' | jq
>> '.services."<service-name>".units."<service-name>/0".machine' | tr -d
>> "\""
>>
>
> FYI, you can do "juju status --format=json" and skip the Python bit.
>
> will return the ID of the machine for <service-name> (replace that by
>> yarn-master or whatever name you gave), which saves you browsing through
>> several pages of juju status...
>>
>> Let us know how your testing goes!
>>
>> Best,
>> Sam
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu / Canonical UK LTD / Juju
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Mon, Feb 2, 2015 at 2:52 PM, Ken Williams  wrote:
>>
>>> Hi Sam,
>>>
>>> Just to confirm that deploying the spark-master and the yarn-hdfs-master
>>> to the same machine seems to have worked!  :-)
>>>
>>> // use 'juju status' to find which machine yarn-hdfs-master is on
>>> juju status
>>> [ etc...]
>>> // say...machine: "4"
>>>
>>> // deploy spark-master to same machine
>>> juju deploy --to 4 spark-master
>>>
>>> // add relations
>>> juju add-relation yarn-hdfs-master:resourcemanager spark-master
>>> juju add-relation yarn-hdfs-master:namenode spark-master
>>>
>>>
>>> // run test
>>> root@ip-172-31-21-92:~# spark-submit --class
>>> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>> 25 --master yarn   --num-executors 3 --driver-memory 1g
>>> --executor-memory 1g --executor-cores 1 --deploy-mode cluster
>>> --queue thequeue /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>> Spark assembly has been built with Hive, including Datanucleus jars on
>>> classpath
>>> 15/02/02 13:40:45 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> Pi is roughly 3.1405888
>>>
>>>
>>> Many thanks again for all your help,
>>>
>>> Best Regards,
>>>
>>> Ken
>>>
>>>
>>>
>>> On 30 January 2015 at 18:11, Ken Williams  wrote:
>>>

 Ok - Sam, I'll try this and let you know.

 Thanks again for all your help,

 Best Regards,

 Ken



 On 30 January 2015 at 18:09, Samuel Cozannet <
 samuel.cozan...@canonical.com> wrote:

> I'll have a look asap, but probably not before Tuesday.
>
> This may be a "my gut tells me that" hunch but, if you have the time, try
> to collocate YARN and Spark; that will guarantee you have YARN_CONF_DIR
> set. I am 90% sure it will fix your problem.
>
> YARN itself will not eat many resources, so you should be alright, and it
> may allow you to move forward instead of being stuck.
>
> Best,
> Sam
>
>
>
>
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 7:01 PM, Ken Williams  wrote:
>
>> Hi Sam,
>>
>> Attached is my bundles.yaml file.
>>
>> Also, there is no file 'directories.sh' on my spark-master/0
>> machine (see below),
>>
>> ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
>> total 12
>> -rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
>> -rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
>> -rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
>> ubuntu@ip-172-31-54-245:~$
>>
>>
>> Many thanks again for your help,
>>
>> Ken
>>
>>
>> On 30 January 2015 at 15:45, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Hey,
>>>
>>> can you send the bundle you're using (in the GUI, bottom right,
>>

Re: How best to install Spark?

2015-02-02 Thread Andrew Wilkins
On Mon, Feb 2, 2015 at 9:57 PM, Samuel Cozannet <
samuel.cozan...@canonical.com> wrote:

> Excellent! Happy to help you through your discovery of awesomeness with
> Juju :)
>
> Note that, if you have jq installed (which I advise: sudo apt-get install
> jq),
> juju status | python -c 'import sys, yaml, json;
> json.dump(yaml.load(sys.stdin), sys.stdout, indent=4)' | jq
> '.services."<service-name>".units."<service-name>/0".machine' | tr -d "\""
>

FYI, you can do "juju status --format=json" and skip the Python bit.
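
For example, assuming the service is named yarn-hdfs-master, as elsewhere in
this thread, the whole pipeline collapses to something like:

# jq's -r flag prints raw strings, so the trailing tr -d '"' is not needed
juju status --format=json | \
  jq -r '.services."yarn-hdfs-master".units."yarn-hdfs-master/0".machine'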

will return the ID of the machine for <service-name> (replace that by
> yarn-master or whatever name you gave), which saves you browsing through
> several pages of juju status...
>
> Let us know how your testing goes!
>
> Best,
> Sam
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Mon, Feb 2, 2015 at 2:52 PM, Ken Williams  wrote:
>
>> Hi Sam,
>>
>> Just to confirm that deploying the spark-master and the yarn-hdfs-master
>> to the same machine seems to have worked!  :-)
>>
>> // use 'juju status' to find which machine yarn-hdfs-master is on
>> juju status
>> [ etc...]
>> // say...machine: "4"
>>
>> // deploy spark-master to same machine
>> juju deploy --to 4 spark-master
>>
>> // add relations
>> juju add-relation yarn-hdfs-master:resourcemanager spark-master
>> juju add-relation yarn-hdfs-master:namenode spark-master
>>
>>
>> // run test
>> root@ip-172-31-21-92:~# spark-submit --class
>> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>> 25 --master yarn   --num-executors 3 --driver-memory 1g
>> --executor-memory 1g --executor-cores 1 --deploy-mode cluster
>> --queue thequeue /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>> Spark assembly has been built with Hive, including Datanucleus jars on
>> classpath
>> 15/02/02 13:40:45 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> Pi is roughly 3.1405888
>>
>>
>> Many thanks again for all your help,
>>
>> Best Regards,
>>
>> Ken
>>
>>
>>
>> On 30 January 2015 at 18:11, Ken Williams  wrote:
>>
>>>
>>> Ok - Sam, I'll try this and let you know.
>>>
>>> Thanks again for all your help,
>>>
>>> Best Regards,
>>>
>>> Ken
>>>
>>>
>>>
>>> On 30 January 2015 at 18:09, Samuel Cozannet <
>>> samuel.cozan...@canonical.com> wrote:
>>>
 I'll have a look asap, but probably not before Tuesday.

 This may be a "my gut tells me that" hunch but, if you have the time, try
 to collocate YARN and Spark; that will guarantee you have YARN_CONF_DIR
 set. I am 90% sure it will fix your problem.

 YARN itself will not eat many resources, so you should be alright, and it
 may allow you to move forward instead of being stuck.

 Best,
 Sam






 Best,
 Samuel

 --
 Samuel Cozannet
 Cloud, Big Data and IoT Strategy Team
 Business Development - Cloud and ISV Ecosystem
 Changing the Future of Cloud
 Ubuntu / Canonical UK LTD / Juju
 samuel.cozan...@canonical.com
 mob: +33 616 702 389
 skype: samnco
 Twitter: @SaMnCo_23

 On Fri, Jan 30, 2015 at 7:01 PM, Ken Williams  wrote:

> Hi Sam,
>
> Attached is my bundles.yaml file.
>
> Also, there is no file 'directories.sh' on my spark-master/0
> machine (see below),
>
> ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
> total 12
> -rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
> -rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
> -rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
> ubuntu@ip-172-31-54-245:~$
>
>
> Many thanks again for your help,
>
> Ken
>
>
> On 30 January 2015 at 15:45, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Hey,
>>
>> Can you send the bundle you're using? (In the GUI, bottom right, the
>> "export" button should give you a bundles.yaml file.) Please send that to
>> me, so I can bootstrap the same environment as you are playing with.
>>
>> also
>> * can you let me know if you have a file
>> /etc/profile.d/directories.sh?
>> * if yes, can you execute it from your command line, then do the
>> spark command again, and let me know?
>>
>> Thx,
>> Sam
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu 

Re: How best to install Spark?

2015-02-02 Thread Samuel Cozannet
Excellent! Happy to help you through your discovery of awesomeness with
Juju :)

Note that, if you have jq installed (which I advise: sudo apt-get install
jq),
juju status | python -c 'import sys, yaml, json;
json.dump(yaml.load(sys.stdin), sys.stdout, indent=4)' | jq
'.services."<service-name>".units."<service-name>/0".machine' | tr -d "\""

will return the ID of the machine for <service-name> (replace that by
yarn-master or whatever name you gave), which saves you browsing through
several pages of juju status...

Let us know how your testing goes!

Best,
Sam


Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu / Canonical UK LTD / Juju
samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Mon, Feb 2, 2015 at 2:52 PM, Ken Williams  wrote:

> Hi Sam,
>
> Just to confirm that deploying the spark-master and the yarn-hdfs-master
> to the same machine seems to have worked!  :-)
>
> // use 'juju status' to find which machine yarn-hdfs-master is on
> juju status
> [ etc...]
> // say...machine: "4"
>
> // deploy spark-master to same machine
> juju deploy --to 4 spark-master
>
> // add relations
> juju add-relation yarn-hdfs-master:resourcemanager spark-master
> juju add-relation yarn-hdfs-master:namenode spark-master
>
>
> // run test
> root@ip-172-31-21-92:~# spark-submit --class
> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
> 25 --master yarn   --num-executors 3 --driver-memory 1g
> --executor-memory 1g --executor-cores 1 --deploy-mode cluster
> --queue thequeue /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> 15/02/02 13:40:45 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Pi is roughly 3.1405888
>
>
> Many thanks again for all your help,
>
> Best Regards,
>
> Ken
>
>
>
> On 30 January 2015 at 18:11, Ken Williams  wrote:
>
>>
>> Ok - Sam, I'll try this and let you know.
>>
>> Thanks again for all your help,
>>
>> Best Regards,
>>
>> Ken
>>
>>
>>
>> On 30 January 2015 at 18:09, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> I'll have a look asap, but probably not before Tuesday.
>>>
>>> This may be a "my gut tells me that" hunch but, if you have the time, try
>>> to collocate YARN and Spark; that will guarantee you have YARN_CONF_DIR
>>> set. I am 90% sure it will fix your problem.
>>>
>>> YARN itself will not eat many resources, so you should be alright, and it
>>> may allow you to move forward instead of being stuck.
>>>
>>> Best,
>>> Sam
>>>
>>>
>>>
>>>
>>>
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu / Canonical UK LTD / Juju
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Fri, Jan 30, 2015 at 7:01 PM, Ken Williams  wrote:
>>>
 Hi Sam,

 Attached is my bundles.yaml file.

 Also, there is no file 'directories.sh' on my spark-master/0
 machine (see below),

 ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
 total 12
 -rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
 -rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
 -rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
 ubuntu@ip-172-31-54-245:~$


 Many thanks again for your help,

 Ken


 On 30 January 2015 at 15:45, Samuel Cozannet <
 samuel.cozan...@canonical.com> wrote:

> Hey,
>
> Can you send the bundle you're using? (In the GUI, bottom right, the
> "export" button should give you a bundles.yaml file.) Please send that to
> me, so I can bootstrap the same environment as you are playing with.
>
> also
> * can you let me know if you have a file /etc/profile.d/directories.sh?
> * if yes, can you execute it from your command line, then do the spark
> command again, and let me know?
>
> Thx,
> Sam
>
>
>
>
>
>
>
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 3:46 PM, Ken Williams  wrote:
>
>> Ok - I have been able to add the relation using this,
>>
>> juju add-r

Re: How best to install Spark?

2015-02-02 Thread Ken Williams
Hi Sam,

Just to confirm that deploying the spark-master and the yarn-hdfs-master
to the same machine seems to have worked!  :-)

// use 'juju status' to find which machine yarn-hdfs-master is on
juju status
[ etc...]
// say...machine: "4"

// deploy spark-master to same machine
juju deploy --to 4 spark-master

// add relations
juju add-relation yarn-hdfs-master:resourcemanager spark-master
juju add-relation yarn-hdfs-master:namenode spark-master


// run test
root@ip-172-31-21-92:~# spark-submit --class
org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
25 --master yarn   --num-executors 3 --driver-memory 1g
--executor-memory 1g --executor-cores 1 --deploy-mode cluster
--queue thequeue /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on
classpath
15/02/02 13:40:45 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Pi is roughly 3.1405888
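
(Note on argument order: spark-submit passes everything after the
application jar through to the application itself, so in the command as
pasted above the --master and YARN options come after the jar. The
conventional ordering, with the same class, jar and settings, would be:

spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode cluster \
  --num-executors 3 --driver-memory 1g \
  --executor-memory 1g --executor-cores 1 \
  --queue thequeue \
  /tmp/spark-examples-1.2.0-hadoop2.4.0.jar 25

with 25, the number of slices for SparkPi, as the one application argument.)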


Many thanks again for all your help,

Best Regards,

Ken



On 30 January 2015 at 18:11, Ken Williams  wrote:

>
> Ok - Sam, I'll try this and let you know.
>
> Thanks again for all your help,
>
> Best Regards,
>
> Ken
>
>
>
> On 30 January 2015 at 18:09, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> I'll have a look asap, but probably not before Tuesday.
>>
>> This may be a "my gut tells me that" hunch but, if you have the time, try
>> to collocate YARN and Spark; that will guarantee you have YARN_CONF_DIR
>> set. I am 90% sure it will fix your problem.
>>
>> YARN itself will not eat many resources, so you should be alright, and it
>> may allow you to move forward instead of being stuck.
>>
>> Best,
>> Sam
>>
>>
>>
>>
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu / Canonical UK LTD / Juju
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Fri, Jan 30, 2015 at 7:01 PM, Ken Williams  wrote:
>>
>>> Hi Sam,
>>>
>>> Attached is my bundles.yaml file.
>>>
>>> Also, there is no file 'directories.sh' on my spark-master/0 machine
>>> (see below),
>>>
>>> ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
>>> total 12
>>> -rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
>>> -rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
>>> -rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
>>> ubuntu@ip-172-31-54-245:~$
>>>
>>>
>>> Many thanks again for your help,
>>>
>>> Ken
>>>
>>>
>>> On 30 January 2015 at 15:45, Samuel Cozannet <
>>> samuel.cozan...@canonical.com> wrote:
>>>
 Hey,

 Can you send the bundle you're using? (In the GUI, bottom right, the
 "export" button should give you a bundles.yaml file.) Please send that to
 me, so I can bootstrap the same environment as you are playing with.

 also
 * can you let me know if you have a file /etc/profile.d/directories.sh?
 * if yes, can you execute it from your command line, then do the spark
 command again, and let me know?

 Thx,
 Sam









 Best,
 Samuel

 --
 Samuel Cozannet
 Cloud, Big Data and IoT Strategy Team
 Business Development - Cloud and ISV Ecosystem
 Changing the Future of Cloud
 Ubuntu / Canonical UK LTD / Juju
 samuel.cozan...@canonical.com
 mob: +33 616 702 389
 skype: samnco
 Twitter: @SaMnCo_23

 On Fri, Jan 30, 2015 at 3:46 PM, Ken Williams  wrote:

> Ok - I have been able to add the relation using this,
>
> juju add-relation yarn-hdfs-master:resourcemanager
> spark-master
>
> But I still cannot see a /etc/hadoop/conf directory on the
> spark-master machine
> so I still get the same error about HADOOP_CONF_DIR and YARN_CONF_DIR
> (below),
>
>
> root@ip-172-31-60-53:~# spark-submit --class
> org.apache.spark.examples.SparkPi --master yarn-client
> --num-executors 3 --driver-memory 1g --executor-memory 1g
> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> Exception in thread "main" java.lang.Exception: When running with
> master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set 
> in
> the environment.
> at
> org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
> at
> org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
> at org.

Re: How best to install Spark?

2015-01-30 Thread Ken Williams
Ok - Sam, I'll try this and let you know.

Thanks again for all your help,

Best Regards,

Ken



On 30 January 2015 at 18:09, Samuel Cozannet 
wrote:

> I'll have a look asap, but probably not before Tuesday.
>
> This may be a "my gut tells me that" hunch but, if you have the time, try
> to collocate YARN and Spark; that will guarantee you have YARN_CONF_DIR
> set. I am 90% sure it will fix your problem.
>
> YARN itself will not eat many resources, so you should be alright, and it
> may allow you to move forward instead of being stuck.
>
> Best,
> Sam
>
>
>
>
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 7:01 PM, Ken Williams  wrote:
>
>> Hi Sam,
>>
>> Attached is my bundles.yaml file.
>>
>> Also, there is no file 'directories.sh' on my spark-master/0 machine
>> (see below),
>>
>> ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
>> total 12
>> -rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
>> -rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
>> -rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
>> ubuntu@ip-172-31-54-245:~$
>>
>>
>> Many thanks again for your help,
>>
>> Ken
>>
>>
>> On 30 January 2015 at 15:45, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Hey,
>>>
>>> Can you send the bundle you're using? (In the GUI, bottom right, the
>>> "export" button should give you a bundles.yaml file.) Please send that
>>> to me, so I can bootstrap the same environment as you are playing with.
>>>
>>> also
>>> * can you let me know if you have a file /etc/profile.d/directories.sh?
>>> * if yes, can you execute it from your command line, then do the spark
>>> command again, and let me know?
>>>
>>> Thx,
>>> Sam
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu / Canonical UK LTD / Juju
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Fri, Jan 30, 2015 at 3:46 PM, Ken Williams  wrote:
>>>
 Ok - I have been able to add the relation using this,

 juju add-relation yarn-hdfs-master:resourcemanager
 spark-master

 But I still cannot see a /etc/hadoop/conf directory on the spark-master
 machine
 so I still get the same error about HADOOP_CONF_DIR and YARN_CONF_DIR
 (below),


 root@ip-172-31-60-53:~# spark-submit --class
 org.apache.spark.examples.SparkPi --master yarn-client
 --num-executors 3 --driver-memory 1g --executor-memory 1g
 --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
 Spark assembly has been built with Hive, including Datanucleus jars on
 classpath
 Exception in thread "main" java.lang.Exception: When running with
 master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in
 the environment.
 at
 org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
 at
 org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
 at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
 at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
 root@ip-172-31-60-53:~#

 Should there be a /etc/hadoop/conf directory ?

 Thanks for any help,

 Ken


 On 30 January 2015 at 12:59, Samuel Cozannet <
 samuel.cozan...@canonical.com> wrote:

> Have you tried without ':master'?
>
> juju add-relation yarn-hdfs-master:resourcemanager spark-master
>
> I think Spark master consumes the relationship but doesn't have to
> expose its master relationship.
>
> Rule of thumb: when a relation is unambiguous on one of its ends,
> there is no need to specify it when adding it.
>
> Another option if this doesn't work is to use the GUI to create the
> relation. It will give you a dropdown of available relationships between
> entities.
>
> Let me know how it goes,
> Thx,
> Sam
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter

Re: How best to install Spark?

2015-01-30 Thread Samuel Cozannet
I'll have a look asap, but probably not before Tuesday.

This may be a "my gut tells me that" hunch but, if you have the time, try
to collocate YARN and Spark; that will guarantee you have YARN_CONF_DIR
set. I am 90% sure it will fix your problem.

YARN itself will not eat many resources, so you should be alright, and it
may allow you to move forward instead of being stuck.

Best,
Sam






Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu / Canonical UK LTD / Juju
samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Fri, Jan 30, 2015 at 7:01 PM, Ken Williams  wrote:

> Hi Sam,
>
> Attached is my bundles.yaml file.
>
> Also, there is no file 'directories.sh' on my spark-master/0 machine
> (see below),
>
> ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
> total 12
> -rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
> -rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
> -rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
> ubuntu@ip-172-31-54-245:~$
>
>
> Many thanks again for your help,
>
> Ken
>
>
> On 30 January 2015 at 15:45, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Hey,
>>
>> Can you send the bundle you're using? (In the GUI, bottom right, the
>> "export" button should give you a bundles.yaml file.) Please send that
>> to me, so I can bootstrap the same environment as you are playing with.
>>
>> also
>> * can you let me know if you have a file /etc/profile.d/directories.sh?
>> * if yes, can you execute it from your command line, then do the spark
>> command again, and let me know?
>>
>> Thx,
>> Sam
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu / Canonical UK LTD / Juju
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Fri, Jan 30, 2015 at 3:46 PM, Ken Williams  wrote:
>>
>>> Ok - I have been able to add the relation using this,
>>>
>>> juju add-relation yarn-hdfs-master:resourcemanager
>>> spark-master
>>>
>>> But I still cannot see a /etc/hadoop/conf directory on the spark-master
>>> machine
>>> so I still get the same error about HADOOP_CONF_DIR and YARN_CONF_DIR
>>> (below),
>>>
>>>
>>> root@ip-172-31-60-53:~# spark-submit --class
>>> org.apache.spark.examples.SparkPi --master yarn-client
>>> --num-executors 3 --driver-memory 1g --executor-memory 1g
>>> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
>>> Spark assembly has been built with Hive, including Datanucleus jars on
>>> classpath
>>> Exception in thread "main" java.lang.Exception: When running with master
>>> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
>>> environment.
>>> at
>>> org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
>>> at
>>> org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
>>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
>>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>> root@ip-172-31-60-53:~#
>>>
>>> Should there be a /etc/hadoop/conf directory ?
>>>
>>> Thanks for any help,
>>>
>>> Ken
>>>
>>>
>>> On 30 January 2015 at 12:59, Samuel Cozannet <
>>> samuel.cozan...@canonical.com> wrote:
>>>
 Have you tried without ':master'?

 juju add-relation yarn-hdfs-master:resourcemanager spark-master

 I think Spark master consumes the relationship but doesn't have to
 expose its master relationship.

 Rule of thumb: when a relation is unambiguous on one of its ends,
 there is no need to specify it when adding it.

 Another option if this doesn't work is to use the GUI to create the
 relation. It will give you a dropdown of available relationships between
 entities.

 Let me know how it goes,
 Thx,
 Sam


 Best,
 Samuel

 --
 Samuel Cozannet
 Cloud, Big Data and IoT Strategy Team
 Business Development - Cloud and ISV Ecosystem
 Changing the Future of Cloud
 Ubuntu / Canonical UK LTD / Juju
 samuel.cozan...@canonical.com
 mob: +33 616 702 389
 skype: samnco
 Twitter: @SaMnCo_23

 On Fri, Jan 30, 2015 at 1:09 PM, Ken Williams  wrote:

> Hi Sam,
>
> I understand what you are saying but when I try to add the 2
> relations I get this error,
>
> root@adminuser-VirtualBox:~# juju add-relation
> yarn-hdfs-master:resourcemanager spark-master:master
> ERROR no r

Re: How best to install Spark?

2015-01-30 Thread Ken Williams
Hi Sam,

Attached is my bundles.yaml file.

Also, there is no file 'directories.sh' on my spark-master/0 machine
(see below),

ubuntu@ip-172-31-54-245:~$ ls -l /etc/profile.d/
total 12
-rw-r--r-- 1 root root 1559 Jul 29  2014 Z97-byobu.sh
-rwxr-xr-x 1 root root 2691 Oct  6 13:19 Z99-cloud-locale-test.sh
-rw-r--r-- 1 root root  663 Apr  7  2014 bash_completion.sh
ubuntu@ip-172-31-54-245:~$


Many thanks again for your help,

Ken


On 30 January 2015 at 15:45, Samuel Cozannet 
wrote:

> Hey,
>
> Can you send the bundle you're using? (In the GUI, bottom right, the
> "export" button should give you a bundles.yaml file.) Please send that
> to me, so I can bootstrap the same environment as you are playing with.
>
> also
> * can you let me know if you have a file /etc/profile.d/directories.sh?
> * if yes, can you execute it from your command line, then do the spark
> command again, and let me know?
>
> Thx,
> Sam
>
>
>
>
>
>
>
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 3:46 PM, Ken Williams  wrote:
>
>> Ok - I have been able to add the relation using this,
>>
>> juju add-relation yarn-hdfs-master:resourcemanager
>> spark-master
>>
>> But I still cannot see a /etc/hadoop/conf directory on the spark-master
>> machine
>> so I still get the same error about HADOOP_CONF_DIR and YARN_CONF_DIR
>> (below),
>>
>>
>> root@ip-172-31-60-53:~# spark-submit --class
>> org.apache.spark.examples.SparkPi --master yarn-client
>> --num-executors 3 --driver-memory 1g --executor-memory 1g
>> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
>> Spark assembly has been built with Hive, including Datanucleus jars on
>> classpath
>> Exception in thread "main" java.lang.Exception: When running with master
>> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
>> environment.
>> at
>> org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
>> at
>> org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> root@ip-172-31-60-53:~#
>>
>> Should there be a /etc/hadoop/conf directory ?
>>
>> Thanks for any help,
>>
>> Ken
>>
>>
>> On 30 January 2015 at 12:59, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Have you tried without ':master'?
>>>
>>> juju add-relation yarn-hdfs-master:resourcemanager spark-master
>>>
>>> I think Spark master consumes the relationship but doesn't have to
>>> expose its master relationship.
>>>
>>> Rule of thumb: when a relation is unambiguous on one of its ends,
>>> there is no need to specify it when adding it.
>>>
>>> Another option if this doesn't work is to use the GUI to create the
>>> relation. It will give you a dropdown of available relationships between
>>> entities.
>>>
>>> Let me know how it goes,
>>> Thx,
>>> Sam
>>>
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu / Canonical UK LTD / Juju
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Fri, Jan 30, 2015 at 1:09 PM, Ken Williams  wrote:
>>>
 Hi Sam,

 I understand what you are saying but when I try to add the 2
 relations I get this error,

 root@adminuser-VirtualBox:~# juju add-relation
 yarn-hdfs-master:resourcemanager spark-master:master
 ERROR no relations found
 root@adminuser-VirtualBox:~# juju add-relation
 yarn-hdfs-master:namenode spark-master:master
 ERROR no relations found

   Am I adding the relations right ?

   Attached is my 'juju status' file.

   Thanks for all your help,

 Ken





 On 30 January 2015 at 11:16, Samuel Cozannet <
 samuel.cozan...@canonical.com> wrote:

> Hey Ken,
>
> Yes, you need to create the relationship between the 2 entities so
> they know about each other.
>
> Looking at the list of hooks for the charm, you can see there are 2 hooks
> named namenode-relation-changed and resourcemanager-relation-changed

Re: How best to install Spark?

2015-01-30 Thread Samuel Cozannet
Hey,

Can you send the bundle you're using? (In the GUI, bottom right, the
"export" button should give you a bundles.yaml file.) Please send that
to me, so I can bootstrap the same environment as you are playing with.

also
* can you let me know if you have a file /etc/profile.d/directories.sh?
* if yes, can you execute it from your command line, then do the spark
command again, and let me know?
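
In shell terms, that check is roughly the following (directories.sh being
the environment file I expect the charm to drop; sourcing it should set the
Hadoop variables):

# does the file exist, and does sourcing it set HADOOP_CONF_DIR?
ls -l /etc/profile.d/directories.sh
. /etc/profile.d/directories.sh && echo "$HADOOP_CONF_DIR"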

Thx,
Sam









Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu / Canonical UK LTD / Juju
samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Fri, Jan 30, 2015 at 3:46 PM, Ken Williams  wrote:

> Ok - I have been able to add the relation using this,
>
> juju add-relation yarn-hdfs-master:resourcemanager
> spark-master
>
> But I still cannot see a /etc/hadoop/conf directory on the spark-master
> machine
> so I still get the same error about HADOOP_CONF_DIR and YARN_CONF_DIR
> (below),
>
>
> root@ip-172-31-60-53:~# spark-submit --class
> org.apache.spark.examples.SparkPi --master yarn-client
> --num-executors 3 --driver-memory 1g --executor-memory 1g
> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> Exception in thread "main" java.lang.Exception: When running with master
> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
> environment.
> at
> org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
> at
> org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> root@ip-172-31-60-53:~#
>
> Should there be a /etc/hadoop/conf directory ?
>
> Thanks for any help,
>
> Ken
>
>
> On 30 January 2015 at 12:59, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Have you tried without ':master'?
>>
>> juju add-relation yarn-hdfs-master:resourcemanager spark-master
>>
>> I think Spark master consumes the relationship but doesn't have to expose
>> its master relationship.
>>
>> Rule of thumb: when a relation is unambiguous on one of its ends, there
>> is no need to specify it when adding it.
>>
>> Another option if this doesn't work is to use the GUI to create the
>> relation. It will give you a dropdown of available relationships between
>> entities.
>>
>> Let me know how it goes,
>> Thx,
>> Sam
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu / Canonical UK LTD / Juju
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Fri, Jan 30, 2015 at 1:09 PM, Ken Williams  wrote:
>>
>>> Hi Sam,
>>>
>>> I understand what you are saying but when I try to add the 2
>>> relations I get this error,
>>>
>>> root@adminuser-VirtualBox:~# juju add-relation
>>> yarn-hdfs-master:resourcemanager spark-master:master
>>> ERROR no relations found
>>> root@adminuser-VirtualBox:~# juju add-relation
>>> yarn-hdfs-master:namenode spark-master:master
>>> ERROR no relations found
>>>
>>>   Am I adding the relations right ?
>>>
>>>   Attached is my 'juju status' file.
>>>
>>>   Thanks for all your help,
>>>
>>> Ken
>>>
>>>
>>>
>>>
>>>
>>> On 30 January 2015 at 11:16, Samuel Cozannet <
>>> samuel.cozan...@canonical.com> wrote:
>>>
 Hey Ken,

 Yes, you need to create the relationship between the 2 entities so they
 know about each other.

 Looking at the list of hooks for the charm, you can see there are 2 hooks
 named namenode-relation-changed and resourcemanager-relation-changed,
 which are related to YARN/Hadoop.
 Looking deeper in the code, you'll notice they reference a function
 found in bdutils.py called "setHadoopEnvVar()", which based on its name
 should set the HADOOP_CONF_DIR.

 There are 2 relations, so add both of them.

 Note that I didn't test this myself, but I expect this should fix the
 problem. If it doesn't please come back to us...

 Thanks!
 Sam


 Best,
 Samuel

 --
 Samuel Cozannet
 Cloud, Big Data and IoT Strategy Team
 Business Development - Cloud and ISV Ecosystem
 Changing the Future of Cloud
 Ubuntu 

Re: How best to install Spark?

2015-01-30 Thread Ken Williams
Ok - I have been able to add the relation using this,

juju add-relation yarn-hdfs-master:resourcemanager
spark-master

But I still cannot see a /etc/hadoop/conf directory on the spark-master
machine
so I still get the same error about HADOOP_CONF_DIR and YARN_CONF_DIR
(below),


root@ip-172-31-60-53:~# spark-submit --class
org.apache.spark.examples.SparkPi --master yarn-client
--num-executors 3 --driver-memory 1g --executor-memory 1g
--executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
Spark assembly has been built with Hive, including Datanucleus jars on
classpath
Exception in thread "main" java.lang.Exception: When running with master
'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
environment.
at
org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
at
org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
root@ip-172-31-60-53:~#

Should there be a /etc/hadoop/conf directory ?
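
(A quick way to double-check, assuming the usual Hadoop packaging layout:

ls -l /etc/hadoop/conf

If that directory is absent, there is nothing for HADOOP_CONF_DIR or
YARN_CONF_DIR to point at.)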

Thanks for any help,

Ken


On 30 January 2015 at 12:59, Samuel Cozannet 
wrote:

> Have you tried without ':master'?
>
> juju add-relation yarn-hdfs-master:resourcemanager spark-master
>
> I think Spark master consumes the relationship but doesn't have to expose
> its master relationship.
>
> Rule of thumb: when a relation is unambiguous on one of its ends, there
> is no need to specify it when adding it.
>
> Another option if this doesn't work is to use the GUI to create the
> relation. It will give you a dropdown of available relationships between
> entities.
>
> Let me know how it goes,
> Thx,
> Sam
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 1:09 PM, Ken Williams  wrote:
>
>> Hi Sam,
>>
>> I understand what you are saying but when I try to add the 2
>> relations I get this error,
>>
>> root@adminuser-VirtualBox:~# juju add-relation
>> yarn-hdfs-master:resourcemanager spark-master:master
>> ERROR no relations found
>> root@adminuser-VirtualBox:~# juju add-relation yarn-hdfs-master:namenode
>> spark-master:master
>> ERROR no relations found
>>
>>   Am I adding the relations right ?
>>
>>   Attached is my 'juju status' file.
>>
>>   Thanks for all your help,
>>
>> Ken
>>
>>
>>
>>
>>
>> On 30 January 2015 at 11:16, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Hey Ken,
>>>
>>> Yes, you need to create the relationship between the 2 entities so they
>>> know about each other.
>>>
>>> Looking at the list of hooks for the charm, you can see there are 2 hooks
>>> named namenode-relation-changed and resourcemanager-relation-changed,
>>> which are related to YARN/Hadoop.
>>> Looking deeper in the code, you'll notice they reference a function
>>> found in bdutils.py called "setHadoopEnvVar()", which based on its name
>>> should set the HADOOP_CONF_DIR.
>>>
>>> There are 2 relations, so add both of them.
>>>
>>> Note that I didn't test this myself, but I expect this should fix the
>>> problem. If it doesn't please come back to us...
>>>
>>> Thanks!
>>> Sam
>>>
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu / Canonical UK LTD / Juju
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Fri, Jan 30, 2015 at 11:51 AM, Ken Williams  wrote:
>>>

 Thanks, Kapil - this works :-)

 I can now run the SparkPi example successfully.
 root@ip-172-31-60-53:~# spark-submit --class
 org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
 Spark assembly has been built with Hive, including Datanucleus jars on
 classpath
 15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop
 library for your platform... using builtin-java classes where applicable
 Pi is roughly 3.14318

 root@ip-172-31-60-53:~#

 I'm now trying to run the same example with the spark-submit '--master'
 option set to either 'yarn-cluster' or 'yarn-client'
 but I keep getting the same error :

 root@

Re: How best to install Spark?

2015-01-30 Thread Samuel Cozannet
Have you tried without ':master'?

juju add-relation yarn-hdfs-master:resourcemanager spark-master

I think Spark master consumes the relationship but doesn't have to expose
its master relationship.

Rule of thumb: when a relation is unambiguous on one of its ends, there
is no need to specify it when adding it.

Another option if this doesn't work is to use the GUI to create the
relation. It will give you a dropdown of available relationships between
entities.

Let me know how it goes,
Thx,
Sam


Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu / Canonical UK LTD / Juju
samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Fri, Jan 30, 2015 at 1:09 PM, Ken Williams  wrote:

> Hi Sam,
>
> I understand what you are saying but when I try to add the 2 relations
> I get this error,
>
> root@adminuser-VirtualBox:~# juju add-relation
> yarn-hdfs-master:resourcemanager spark-master:master
> ERROR no relations found
> root@adminuser-VirtualBox:~# juju add-relation yarn-hdfs-master:namenode
> spark-master:master
> ERROR no relations found
>
>   Am I adding the relations right ?
>
>   Attached is my 'juju status' file.
>
>   Thanks for all your help,
>
> Ken
>
>
>
>
>
> On 30 January 2015 at 11:16, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Hey Ken,
>>
>> Yes, you need to create the relationship between the 2 entities so they
>> know about each other.
>>
>> Looking at the list of hooks for the charm, you can see there are 2 hooks
>> named namenode-relation-changed and resourcemanager-relation-changed,
>> which are related to YARN/Hadoop.
>> Looking deeper in the code, you'll notice they reference a function found
>> in bdutils.py called "setHadoopEnvVar()", which based on its name should
>> set the HADOOP_CONF_DIR.
>>
>> There are 2 relations, so add both of them.
>>
>> Note that I didn't test this myself, but I expect this should fix the
>> problem. If it doesn't please come back to us...
>>
>> Thanks!
>> Sam
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu / Canonical UK LTD / Juju
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Fri, Jan 30, 2015 at 11:51 AM, Ken Williams  wrote:
>>
>>>
>>> Thanks, Kapil - this works :-)
>>>
>>> I can now run the SparkPi example successfully.
>>> root@ip-172-31-60-53:~# spark-submit --class
>>> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>> Spark assembly has been built with Hive, including Datanucleus jars on
>>> classpath
>>> 15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> Pi is roughly 3.14318
>>>
>>> root@ip-172-31-60-53:~#
>>>
>>> I'm now trying to run the same example with the spark-submit '--master'
>>> option set to either 'yarn-cluster' or 'yarn-client'
>>> but I keep getting the same error :
>>>
>>> root@ip-172-31-60-53:~# spark-submit --class
>>> org.apache.spark.examples.SparkPi --master yarn-client
>>> --num-executors 3 --driver-memory 1g --executor-memory 1g
>>> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
>>> Spark assembly has been built with Hive, including Datanucleus jars on
>>> classpath
>>> Exception in thread "main" java.lang.Exception: When running with master
>>> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
>>> environment.
>>>
>>> But on my spark-master/0 machine there is no /etc/hadoop/conf directory.
>>> So what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be ?
>>> Do I need to add a juju relation between spark-master and ...
>>> yarn-hdfs-master to make them aware of each other ?
>>>
>>> Thanks for any help,
>>>
>>> Ken
>>>
>>>
>>>
>>>
>>>
>>> On 28 January 2015 at 19:32, Kapil Thangavelu <
>>> kapil.thangav...@canonical.com> wrote:
>>>


 On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams  wrote:

>
> Hi Sam/Amir,
>
> I've been able to 'juju ssh spark-master/0' and I successfully ran
> the two
> simple examples for pyspark and spark-shell,
>
> ./bin/pyspark
> >>> sc.parallelize(range(1000)).count()
> 1000
>
> ./bin/spark-shell
>  scala> sc.parallelize(1 to 1000).count()
> 1000
>
>

Re: How best to install Spark?

2015-01-30 Thread Ken Williams
Hi Sam,

I understand what you are saying but when I try to add the 2 relations
I get this error,

root@adminuser-VirtualBox:~# juju add-relation
yarn-hdfs-master:resourcemanager spark-master:master
ERROR no relations found
root@adminuser-VirtualBox:~# juju add-relation yarn-hdfs-master:namenode
spark-master:master
ERROR no relations found

  Am I adding the relations right ?

  Attached is my 'juju status' file.

  Thanks for all your help,

Ken





On 30 January 2015 at 11:16, Samuel Cozannet 
wrote:

> Hey Ken,
>
> Yes, you need to create the relationship between the 2 entities so they
> know about each other.
>
> Looking at the list of hooks for the charm, you can see there are 2 hooks
> named namenode-relation-changed and resourcemanager-relation-changed,
> which are related to YARN/Hadoop.
> Looking deeper in the code, you'll notice they reference a function found
> in bdutils.py called "setHadoopEnvVar()", which based on its name should
> set the HADOOP_CONF_DIR.
>
> There are 2 relations, so add both of them.
>
> Note that I didn't test this myself, but I expect this should fix the
> problem. If it doesn't please come back to us...
>
> Thanks!
> Sam
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu / Canonical UK LTD / Juju
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 11:51 AM, Ken Williams  wrote:
>
>>
>> Thanks, Kapil - this works :-)
>>
>> I can now run the SparkPi example successfully.
>> root@ip-172-31-60-53:~# spark-submit --class
>> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>> Spark assembly has been built with Hive, including Datanucleus jars on
>> classpath
>> 15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> Pi is roughly 3.14318
>>
>> root@ip-172-31-60-53:~#
>>
>> I'm now trying to run the same example with the spark-submit '--master'
>> option set to either 'yarn-cluster' or 'yarn-client'
>> but I keep getting the same error :
>>
>> root@ip-172-31-60-53:~# spark-submit --class
>> org.apache.spark.examples.SparkPi --master yarn-client
>> --num-executors 3 --driver-memory 1g --executor-memory 1g
>> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
>> Spark assembly has been built with Hive, including Datanucleus jars on
>> classpath
>> Exception in thread "main" java.lang.Exception: When running with master
>> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
>> environment.
>>
>> But on my spark-master/0 machine there is no /etc/hadoop/conf directory.
>> So what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be ?
>> Do I need to add a juju relation between spark-master and ...
>> yarn-hdfs-master to make them aware of each other ?
>>
>> Thanks for any help,
>>
>> Ken
>>
>>
>>
>>
>>
>> On 28 January 2015 at 19:32, Kapil Thangavelu <
>> kapil.thangav...@canonical.com> wrote:
>>
>>>
>>>
>>> On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams  wrote:
>>>

 Hi Sam/Amir,

 I've been able to 'juju ssh spark-master/0' and I successfully ran
 the two
 simple examples for pyspark and spark-shell,

 ./bin/pyspark
 >>> sc.parallelize(range(1000)).count()
 1000

 ./bin/spark-shell
  scala> sc.parallelize(1 to 1000).count()
 1000


 Now I want to run some of the spark examples in the spark-examples*.jar
 file, which I have on my local machine. How do I copy the jar file from
 my local machine to the AWS machine ?

 I have tried 'scp' and 'juju scp' from the local command-line but both
 fail (below),

 root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
 ubuntu@ip-172-31-59:/tmp
 ssh: Could not resolve hostname ip-172-31-59: Name or service not known
 lost connection
 root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
 ubuntu@ip-172-31-59:/tmp
 ERROR exit status 1 (nc: getaddrinfo: Name or service not known)

 Any ideas ?

>>>
>>> juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
>>>

 Ken











 On 28 January 2015 at 17:29, Samuel Cozannet <
 samuel.cozan...@canonical.com> wrote:

> Glad it worked!
>
> I'll make a merge request to the upstream so that it works natively
> from the store asap.
>

Re: How best to install Spark?

2015-01-30 Thread Samuel Cozannet
Hey Ken,

Yes, you need to create the relationship between the 2 entities so they
know about each other.

Looking at the list of hooks for the charm, you can see there are 2 hooks
named namenode-relation-changed and resourcemanager-relation-changed,
which are related to YARN/Hadoop.
Looking deeper in the code, you'll notice they reference a function found
in bdutils.py called "setHadoopEnvVar()", which based on its name should
set the HADOOP_CONF_DIR.

There are 2 relations, so add both of them.
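
Concretely, something along these lines should do it (endpoint names taken
from the hooks above; the spark-master side can stay implicit when it is
unambiguous):

juju add-relation yarn-hdfs-master:namenode spark-master
juju add-relation yarn-hdfs-master:resourcemanager spark-master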

Note that I didn't test this myself, but I expect this should fix the
problem. If it doesn't please come back to us...

Thanks!
Sam


Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu / Canonical UK LTD / Juju
samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Fri, Jan 30, 2015 at 11:51 AM, Ken Williams  wrote:

>
> Thanks, Kapil - this works :-)
>
> I can now run the SparkPi example successfully.
> root@ip-172-31-60-53:~# spark-submit --class
> org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> 15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Pi is roughly 3.14318
>
> root@ip-172-31-60-53:~#
>
> I'm now trying to run the same example with the spark-submit '--master'
> option set to either 'yarn-cluster' or 'yarn-client'
> but I keep getting the same error :
>
> root@ip-172-31-60-53:~# spark-submit --class
> org.apache.spark.examples.SparkPi --master yarn-client
> --num-executors 3 --driver-memory 1g --executor-memory 1g
> --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> Exception in thread "main" java.lang.Exception: When running with master
> 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
> environment.
>
> But on my spark-master/0 machine there is no /etc/hadoop/conf directory.
> So what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be ?
> Do I need to add a juju relation between spark-master and ...
> yarn-hdfs-master to make them aware of each other ?
>
> Thanks for any help,
>
> Ken
>
>
>
>
>
> On 28 January 2015 at 19:32, Kapil Thangavelu <
> kapil.thangav...@canonical.com> wrote:
>
>>
>>
>> On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams  wrote:
>>
>>>
>>> Hi Sam/Amir,
>>>
>>> I've been able to 'juju ssh spark-master/0' and I successfully ran
>>> the two
>>> simple examples for pyspark and spark-shell,
>>>
>>> ./bin/pyspark
>>> >>> sc.parallelize(range(1000)).count()
>>> 1000
>>>
>>> ./bin/spark-shell
>>>  scala> sc.parallelize(1 to 1000).count()
>>> 1000
>>>
>>>
>>> Now I want to run some of the spark examples in the spark-examples*.jar
>>> file, which I have on my local machine. How do I copy the jar file from
>>> my local machine to the AWS machine ?
>>>
>>> I have tried 'scp' and 'juju scp' from the local command-line but both
>>> fail (below),
>>>
>>> root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>> ubuntu@ip-172-31-59:/tmp
>>> ssh: Could not resolve hostname ip-172-31-59: Name or service not known
>>> lost connection
>>> root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>> ubuntu@ip-172-31-59:/tmp
>>> ERROR exit status 1 (nc: getaddrinfo: Name or service not known)
>>>
>>> Any ideas ?
>>>
>>
>> juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
>>
>>>
>>> Ken
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 28 January 2015 at 17:29, Samuel Cozannet <
>>> samuel.cozan...@canonical.com> wrote:
>>>
 Glad it worked!

 I'll make a merge request to the upstream so that it works natively
 from the store asap.

 Thanks for catching that!
 Samuel

 Best,
 Samuel

 --
 Samuel Cozannet
 Cloud, Big Data and IoT Strategy Team
 Business Development - Cloud and ISV Ecosystem
 Changing the Future of Cloud
 Ubuntu / Canonical UK LTD / Juju
 samuel.cozan...@canonical.com
 mob: +33 616 702 389
 skype: samnco
 Twitter: @SaMnCo_23

 On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams  wrote:

>
> Hi Sam (and Maarten),
>
> Cloning Spark 1.2.0 from github seems to have worked!
> I can install the Spark examples afterwards.
>
> T

Re: How best to install Spark?

2015-01-30 Thread Ken Williams
Thanks, Kapil - this works :-)

I can now run the SparkPi example successfully.
root@ip-172-31-60-53:~# spark-submit --class
org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on
classpath
15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Pi is roughly 3.14318

root@ip-172-31-60-53:~#

I'm now trying to run the same example with the spark-submit '--master'
option set to either 'yarn-cluster' or 'yarn-client'
but I keep getting the same error :

root@ip-172-31-60-53:~# spark-submit --class
org.apache.spark.examples.SparkPi --master yarn-client
--num-executors 3 --driver-memory 1g --executor-memory 1g
--executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
Spark assembly has been built with Hive, including Datanucleus jars on
classpath
Exception in thread "main" java.lang.Exception: When running with master
'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the
environment.

But on my spark-master/0 machine there is no /etc/hadoop/conf directory.
So what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be ?
Do I need to add a juju relation between spark-master and ...
yarn-hdfs-master to make them aware of each other ?
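
(My assumption is that if the config directory existed, the fix the error
message asks for would just be to export the variables before submitting,
something like:

export HADOOP_CONF_DIR=/etc/hadoop/conf
export YARN_CONF_DIR="$HADOOP_CONF_DIR"

but there is no such directory on this machine to point them at.)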

Thanks for any help,

Ken





On 28 January 2015 at 19:32, Kapil Thangavelu <
kapil.thangav...@canonical.com> wrote:

>
>
> On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams  wrote:
>
>>
>> Hi Sam/Amir,
>>
>> I've been able to 'juju ssh spark-master/0' and I successfully ran
>> the two
>> simple examples for pyspark and spark-shell,
>>
>> ./bin/pyspark
>> >>> sc.parallelize(range(1000)).count()
>> 1000
>>
>> ./bin/spark-shell
>>  scala> sc.parallelize(1 to 1000).count()
>> 1000
>>
>>
>> Now I want to run some of the spark examples in the spark-examples*.jar
>> file, which I have on my local machine. How do I copy the jar file from
>> my local machine to the AWS machine ?
>>
>> I have tried 'scp' and 'juju scp' from the local command-line but both
>> fail (below),
>>
>> root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>> ubuntu@ip-172-31-59:/tmp
>> ssh: Could not resolve hostname ip-172-31-59: Name or service not known
>> lost connection
>> root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>> ubuntu@ip-172-31-59:/tmp
>> ERROR exit status 1 (nc: getaddrinfo: Name or service not known)
>>
>> Any ideas ?
>>
>
> juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
>
>>
>> Ken
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 28 January 2015 at 17:29, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Glad it worked!
>>>
>>> I'll make a merge request to the upstream so that it works natively from
>>> the store asap.
>>>
>>> Thanks for catching that!
>>> Samuel
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu   / Canonical UK LTD  /
>>> Juju 
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams  wrote:
>>>

 Hi Sam (and Maarten),

 Cloning Spark 1.2.0 from github seems to have worked!
 I can install the Spark examples afterwards.

 Thanks for all your help!

 Yes - Andrew and Angie both say 'hi'  :-)

 Best Regards,

 Ken


 On 28 January 2015 at 16:43, Samuel Cozannet <
 samuel.cozan...@canonical.com> wrote:

> Hey Ken,
>
> So I had a closer look to your Spark problem and found out what went
> wrong.
>
> The charm available on the charmstore is trying to download Spark
> 1.0.2, and the versions available on the Apache website are 1.1.0, 1.1.1
> and 1.2.0.
>
> There is another version of the charm available on GitHub that
> actually will deploy 1.2.0
>
> 1. On your computer, create the below folders & get there:
>
> cd ~
> mkdir charms
> mkdir charms/trusty
> cd charms/trusty
>
> 2. Branch the Spark charm.
>
> git clone https://github.com/Archethought/spark-charm spark
>
> 3. Deploy Spark from local repository
>
> juju deploy --repository=~/charms local:trusty/spark spark-master
> juju deploy --repository=~/charms local:trusty/spark spark-slave
> juju add-relation spark-master:master spark-slave:slave
>
> Worked on AWS for me just minutes ago. Let me know how it goes for
> you. Note that this version of the charm does NOT install the Spark
> examples. The files are present though, so you'll find them in
> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>

Re: How best to install Spark?

2015-01-28 Thread Kapil Thangavelu
On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams  wrote:

>
> Hi Sam/Amir,
>
> I've been able to 'juju ssh spark-master/0' and I successfully ran the
> two
> simple examples for pyspark and spark-shell,
>
> ./bin/pyspark
> >>> sc.parallelize(range(1000)).count()
> 1000
>
> ./bin/spark-shell
>  scala> sc.parallelize(1 to 1000).count()
> 1000
>
>
> Now I want to run some of the spark examples in the spark-examples*.jar
> file, which I have on my local machine. How do I copy the jar file from
> my local machine to the AWS machine ?
>
> I have tried 'scp' and 'juju scp' from the local command-line but both
> fail (below),
>
> root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
> ubuntu@ip-172-31-59:/tmp
> ssh: Could not resolve hostname ip-172-31-59: Name or service not known
> lost connection
> root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
> ubuntu@ip-172-31-59:/tmp
> ERROR exit status 1 (nc: getaddrinfo: Name or service not known)
>
> Any ideas ?
>

juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
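
Since juju resolves the unit name itself, no hostname or IP is needed. A
quick sketch of copying and then verifying on the unit:

# Copy to the unit by name, then confirm the file landed:
juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
juju ssh spark-master/0 "ls -lh /tmp/spark-examples-1.2.0-hadoop2.4.0.jar"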

>
> Ken
>
>
>
>
>
>
>
>
>
>
>
> On 28 January 2015 at 17:29, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Glad it worked!
>>
>> I'll make a merge request to the upstream so that it works natively from
>> the store asap.
>>
>> Thanks for catching that!
>> Samuel
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu   / Canonical UK LTD  /
>> Juju 
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams  wrote:
>>
>>>
>>> Hi Sam (and Maarten),
>>>
>>> Cloning Spark 1.2.0 from github seems to have worked!
>>> I can install the Spark examples afterwards.
>>>
>>> Thanks for all your help!
>>>
>>> Yes - Andrew and Angie both say 'hi'  :-)
>>>
>>> Best Regards,
>>>
>>> Ken
>>>
>>>
>>> On 28 January 2015 at 16:43, Samuel Cozannet <
>>> samuel.cozan...@canonical.com> wrote:
>>>
 Hey Ken,

 So I had a closer look to your Spark problem and found out what went
 wrong.

 The charm available on the charmstore is trying to download Spark
 1.0.2, and the versions available on the Apache website are 1.1.0, 1.1.1
 and 1.2.0.

 There is another version of the charm available on GitHub that actually
 will deploy 1.2.0

 1. On your computer, create the below folders & get there:

 cd ~
 mkdir charms
 mkdir charms/trusty
 cd charms/trusty

 2. Branch the Spark charm.

 git clone https://github.com/Archethought/spark-charm spark

 3. Deploy Spark from local repository

 juju deploy --repository=~/charms local:trusty/spark spark-master
 juju deploy --repository=~/charms local:trusty/spark spark-slave
 juju add-relation spark-master:master spark-slave:slave

 Worked on AWS for me just minutes ago. Let me know how it goes for you.
 Note that this version of the charm does NOT install the Spark examples.
 The files are present though, so you'll find them in
 /var/lib/juju/agents/unit-spark-master-0/charm/files/archive

 Hope that helps...
 Let me know if it works for you!

 Best,
 Sam


 Best,
 Samuel

 --
 Samuel Cozannet
 Cloud, Big Data and IoT Strategy Team
 Business Development - Cloud and ISV Ecosystem
 Changing the Future of Cloud
 Ubuntu   / Canonical UK LTD  /
 Juju 
 samuel.cozan...@canonical.com
 mob: +33 616 702 389
 skype: samnco
 Twitter: @SaMnCo_23

 On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams  wrote:

>
> Hi folks,
>
> I'm completely new to juju so any help is appreciated.
>
> I'm trying to create a hadoop/analytics-type platform.
>
> I've managed to install the 'data-analytics-with-sql-like' bundle
> (using this command)
>
> juju quickstart
> bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>
> This is very impressive, and gives me virtually everything that I want
> (hadoop, hive, etc) - but I also need Spark.
>
> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
> and bundle (
> http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
> however do not seem stable or available and I can't figure out how to
> install them.
>
> Should I just download and install the Spark tar-ball on the nodes
> in my AWS cluster, or is there a better way to do this ?
>
> Thanks in advance,
>
> Ken
>
>
> --
> Juju mailing list
> Juju@lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju

Re: How best to install Spark?

2015-01-28 Thread Ken Williams
Hi Sam/Amir,

I've been able to 'juju ssh spark-master/0' and I successfully ran the two
simple examples for pyspark and spark-shell,

./bin/pyspark
>>> sc.parallelize(range(1000)).count()
1000

./bin/spark-shell
 scala> sc.parallelize(1 to 1000).count()
1000


Now I want to run some of the spark examples in the spark-examples*.jar
file, which I have on my local machine. How do I copy the jar file from
my local machine to the AWS machine?

I have tried 'scp' and 'juju scp' from the local command-line but both fail
(below),

root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
ubuntu@ip-172-31-59:/tmp
ssh: Could not resolve hostname ip-172-31-59: Name or service not known
lost connection
root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
ubuntu@ip-172-31-59:/tmp
ERROR exit status 1 (nc: getaddrinfo: Name or service not known)

Any ideas?

Ken











On 28 January 2015 at 17:29, Samuel Cozannet 
wrote:

> Glad it worked!
>
> I'll make a merge request to the upstream so that it works natively from
> the store asap.
>
> Thanks for catching that!
> Samuel
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu   / Canonical UK LTD  /
> Juju 
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams  wrote:
>
>>
>> Hi Sam (and Maarten),
>>
>> Cloning Spark 1.2.0 from github seems to have worked!
>> I can install the Spark examples afterwards.
>>
>> Thanks for all your help!
>>
>> Yes - Andrew and Angie both say 'hi'  :-)
>>
>> Best Regards,
>>
>> Ken
>>
>>
>> On 28 January 2015 at 16:43, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Hey Ken,
>>>
>>> So I had a closer look to your Spark problem and found out what went
>>> wrong.
>>>
>>> The charm available on the charmstore is trying to download Spark 1.0.2,
>>> and the versions available on the Apache website are 1.1.0, 1.1.1 and
>>> 1.2.0.
>>>
>>> There is another version of the charm available on GitHub that actually
>>> will deploy 1.2.0
>>>
>>> 1. On your computer, create the below folders & get there:
>>>
>>> cd ~
>>> mkdir charms
>>> mkdir charms/trusty
>>> cd charms/trusty
>>>
>>> 2. Branch the Spark charm.
>>>
>>> git clone https://github.com/Archethought/spark-charm spark
>>>
>>> 3. Deploy Spark from local repository
>>>
>>> juju deploy --repository=~/charms local:trusty/spark spark-master
>>> juju deploy --repository=~/charms local:trusty/spark spark-slave
>>> juju add-relation spark-master:master spark-slave:slave
>>>
>>> Worked on AWS for me just minutes ago. Let me know how it goes for you.
>>> Note that this version of the charm does NOT install the Spark examples.
>>> The files are present though, so you'll find them in
>>> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>>>
>>> Hope that helps...
>>> Let me know if it works for you!
>>>
>>> Best,
>>> Sam
>>>
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu   / Canonical UK LTD  /
>>> Juju 
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams  wrote:
>>>

 Hi folks,

 I'm completely new to juju so any help is appreciated.

 I'm trying to create a hadoop/analytics-type platform.

 I've managed to install the 'data-analytics-with-sql-like' bundle
 (using this command)

 juju quickstart
 bundle:data-analytics-with-sql-like/data-analytics-with-sql-like

 This is very impressive, and gives me virtually everything that I want
 (hadoop, hive, etc) - but I also need Spark.

 The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
 and bundle (
 http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
 however do not seem stable or available and I can't figure out how to
 install them.

 Should I just download and install the Spark tar-ball on the nodes
 in my AWS cluster, or is there a better way to do this ?

 Thanks in advance,

 Ken


 --
 Juju mailing list
 Juju@lists.ubuntu.com
 Modify settings or unsubscribe at:
 https://lists.ubuntu.com/mailman/listinfo/juju


>>>
>>
>
-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju


Re: How best to install Spark?

2015-01-28 Thread Samuel Cozannet
Glad it worked!

I'll make a merge request upstream so that it works natively from
the store ASAP.

Thanks for catching that!
Samuel

Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu   / Canonical UK LTD  / Juju

samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams  wrote:

>
> Hi Sam (and Maarten),
>
> Cloning Spark 1.2.0 from github seems to have worked!
> I can install the Spark examples afterwards.
>
> Thanks for all your help!
>
> Yes - Andrew and Angie both say 'hi'  :-)
>
> Best Regards,
>
> Ken
>
>
> On 28 January 2015 at 16:43, Samuel Cozannet <
> samuel.cozan...@canonical.com> wrote:
>
>> Hey Ken,
>>
>> So I had a closer look to your Spark problem and found out what went
>> wrong.
>>
>> The charm available on the charmstore is trying to download Spark 1.0.2,
>> and the versions available on the Apache website are 1.1.0, 1.1.1 and
>> 1.2.0.
>>
>> There is another version of the charm available on GitHub that actually
>> will deploy 1.2.0
>>
>> 1. On your computer, create the below folders & get there:
>>
>> cd ~
>> mkdir charms
>> mkdir charms/trusty
>> cd charms/trusty
>>
>> 2. Branch the Spark charm.
>>
>> git clone https://github.com/Archethought/spark-charm spark
>>
>> 3. Deploy Spark from local repository
>>
>> juju deploy --repository=~/charms local:trusty/spark spark-master
>> juju deploy --repository=~/charms local:trusty/spark spark-slave
>> juju add-relation spark-master:master spark-slave:slave
>>
>> Worked on AWS for me just minutes ago. Let me know how it goes for you.
>> Note that this version of the charm does NOT install the Spark examples.
>> The files are present though, so you'll find them in
>> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>>
>> Hope that helps...
>> Let me know if it works for you!
>>
>> Best,
>> Sam
>>
>>
>> Best,
>> Samuel
>>
>> --
>> Samuel Cozannet
>> Cloud, Big Data and IoT Strategy Team
>> Business Development - Cloud and ISV Ecosystem
>> Changing the Future of Cloud
>> Ubuntu   / Canonical UK LTD  /
>> Juju 
>> samuel.cozan...@canonical.com
>> mob: +33 616 702 389
>> skype: samnco
>> Twitter: @SaMnCo_23
>>
>> On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams  wrote:
>>
>>>
>>> Hi folks,
>>>
>>> I'm completely new to juju so any help is appreciated.
>>>
>>> I'm trying to create a hadoop/analytics-type platform.
>>>
>>> I've managed to install the 'data-analytics-with-sql-like' bundle
>>> (using this command)
>>>
>>> juju quickstart
>>> bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>>>
>>> This is very impressive, and gives me virtually everything that I want
>>> (hadoop, hive, etc) - but I also need Spark.
>>>
>>> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
>>> and bundle (
>>> http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
>>> however do not seem stable or available and I can't figure out how to
>>> install them.
>>>
>>> Should I just download and install the Spark tar-ball on the nodes
>>> in my AWS cluster, or is there a better way to do this ?
>>>
>>> Thanks in advance,
>>>
>>> Ken
>>>
>>>
>>> --
>>> Juju mailing list
>>> Juju@lists.ubuntu.com
>>> Modify settings or unsubscribe at:
>>> https://lists.ubuntu.com/mailman/listinfo/juju
>>>
>>>
>>
>
-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju


Re: How best to install Spark?

2015-01-28 Thread Ken Williams
Hi Sam (and Maarten),

Cloning Spark 1.2.0 from GitHub seems to have worked!
I can install the Spark examples afterwards.

Thanks for all your help!

Yes - Andrew and Angie both say 'hi'  :-)

Best Regards,

Ken


On 28 January 2015 at 16:43, Samuel Cozannet 
wrote:

> Hey Ken,
>
> So I had a closer look to your Spark problem and found out what went wrong.
>
> The charm available on the charmstore is trying to download Spark 1.0.2,
> and the versions available on the Apache website are 1.1.0, 1.1.1 and
> 1.2.0.
>
> There is another version of the charm available on GitHub that actually
> will deploy 1.2.0
>
> 1. On your computer, create the below folders & get there:
>
> cd ~
> mkdir charms
> mkdir charms/trusty
> cd charms/trusty
>
> 2. Branch the Spark charm.
>
> git clone https://github.com/Archethought/spark-charm spark
>
> 3. Deploy Spark from local repository
>
> juju deploy --repository=~/charms local:trusty/spark spark-master
> juju deploy --repository=~/charms local:trusty/spark spark-slave
> juju add-relation spark-master:master spark-slave:slave
>
> Worked on AWS for me just minutes ago. Let me know how it goes for you.
> Note that this version of the charm does NOT install the Spark examples.
> The files are present though, so you'll find them in
> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>
> Hope that helps...
> Let me know if it works for you!
>
> Best,
> Sam
>
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu   / Canonical UK LTD  /
> Juju 
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams  wrote:
>
>>
>> Hi folks,
>>
>> I'm completely new to juju so any help is appreciated.
>>
>> I'm trying to create a hadoop/analytics-type platform.
>>
>> I've managed to install the 'data-analytics-with-sql-like' bundle
>> (using this command)
>>
>> juju quickstart
>> bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>>
>> This is very impressive, and gives me virtually everything that I want
>> (hadoop, hive, etc) - but I also need Spark.
>>
>> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
>> and bundle (
>> http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
>> however do not seem stable or available and I can't figure out how to
>> install them.
>>
>> Should I just download and install the Spark tar-ball on the nodes
>> in my AWS cluster, or is there a better way to do this ?
>>
>> Thanks in advance,
>>
>> Ken
>>
>>
>> --
>> Juju mailing list
>> Juju@lists.ubuntu.com
>> Modify settings or unsubscribe at:
>> https://lists.ubuntu.com/mailman/listinfo/juju
>>
>>
>
-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju


Re: How best to install Spark?

2015-01-28 Thread Samuel Cozannet
Hey Ken,

So I had a closer look at your Spark problem and found out what went wrong.

The charm available on the charmstore is trying to download Spark 1.0.2,
and the versions available on the Apache website are 1.1.0, 1.1.1 and
1.2.0.

There is another version of the charm available on GitHub that will
actually deploy 1.2.0.

1. On your computer, create the below folders & get there:

cd ~
mkdir charms
mkdir charms/trusty
cd charms/trusty

2. Clone the Spark charm.

git clone https://github.com/Archethought/spark-charm spark

3. Deploy Spark from local repository

juju deploy --repository=~/charms local:trusty/spark spark-master
juju deploy --repository=~/charms local:trusty/spark spark-slave
juju add-relation spark-master:master spark-slave:slave

Worked on AWS for me just minutes ago. Let me know how it goes for you.
Note that this version of the charm does NOT install the Spark examples.
The files are present though, so you'll find them in
/var/lib/juju/agents/unit-spark-master-0/charm/files/archive
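
For instance, assuming the examples jar is among those archived files (the
exact name is a guess), you could stage and run one like this:

# On the spark-master unit; the jar glob below is an assumption:
cd /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
cp spark-examples-*.jar /tmp/
spark-submit --class org.apache.spark.examples.SparkPi /tmp/spark-examples-*.jar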

Hope that helps...
Let me know if it works for you!

Best,
Sam


Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu   / Canonical UK LTD  / Juju

samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams  wrote:

>
> Hi folks,
>
> I'm completely new to juju so any help is appreciated.
>
> I'm trying to create a hadoop/analytics-type platform.
>
> I've managed to install the 'data-analytics-with-sql-like' bundle
> (using this command)
>
> juju quickstart
> bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>
> This is very impressive, and gives me virtually everything that I want
> (hadoop, hive, etc) - but I also need Spark.
>
> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
> and bundle (
> http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
> however do not seem stable or available and I can't figure out how to
> install them.
>
> Should I just download and install the Spark tar-ball on the nodes
> in my AWS cluster, or is there a better way to do this ?
>
> Thanks in advance,
>
> Ken
>
>
> --
> Juju mailing list
> Juju@lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju
>
>
-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju


Re: How best to install Spark?

2015-01-28 Thread Samuel Cozannet
Hi Ken!

Good to know you like our charms and bundles! Are you working with Andrew &
Angie?

I have been talking with them several times, so I have a little bit of
background on your use cases. Let me know if you want to do a short hangout
to discuss your specific workload.

Specifically, if you want to use Spark in conjunction with Hadoop, you
probably want to deploy it on the same node as your YARN master. So,
assuming you deployed it and named it yarn-master, you can do the
following (install jq first with "sudo apt-get install jq"):

 TARGET_MACHINE=$(juju stat | python -c 'import sys, yaml, json;
json.dump(yaml.load(sys.stdin), sys.stdout, indent=4)' | jq
'.services."yarn-master".units."yarn-master/0".machine' | tr -d "\"" )

==> This command will output the ID of the machine running the YARN master.

Then

juju deploy --to $TARGET_MACHINE cs:~asanjar/trusty/spark spark-master

Then you'll be able to read from Hadoop into Spark.
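
Spelled out with comments, the same sequence (still assuming the service is
named yarn-master):

# 1. Ask juju which machine hosts yarn-master/0 (YAML -> JSON -> jq):
TARGET_MACHINE=$(juju stat \
  | python -c 'import sys, yaml, json; json.dump(yaml.load(sys.stdin), sys.stdout, indent=4)' \
  | jq '.services."yarn-master".units."yarn-master/0".machine' \
  | tr -d '"')
# 2. Put Spark on that same machine so it sits next to the YARN master:
juju deploy --to $TARGET_MACHINE cs:~asanjar/trusty/spark spark-master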



Best,
Samuel





Best,
Samuel

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu   / Canonical UK LTD  / Juju

samuel.cozan...@canonical.com
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23

On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams  wrote:

>
> Hi folks,
>
> I'm completely new to juju so any help is appreciated.
>
> I'm trying to create a hadoop/analytics-type platform.
>
> I've managed to install the 'data-analytics-with-sql-like' bundle
> (using this command)
>
> juju quickstart
> bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>
> This is very impressive, and gives me virtually everything that I want
> (hadoop, hive, etc) - but I also need Spark.
>
> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
> and bundle (
> http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
> however do not seem stable or available and I can't figure out how to
> install them.
>
> Should I just download and install the Spark tar-ball on the nodes
> in my AWS cluster, or is there a better way to do this ?
>
> Thanks in advance,
>
> Ken
>
>
> --
> Juju mailing list
> Juju@lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju
>
>
-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju