Ok - I have been able to add the relation using this:

    juju add-relation yarn-hdfs-master:resourcemanager spark-master
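For what it's worth, this is the manual workaround I would try if the relation
never creates the config - the paths here are my guesses, not something I've
verified against the charm:

```shell
# Guessed workaround, not verified: copy the Hadoop client config across
# from the YARN master, then point Spark at it before running spark-submit.
#
#   (first copy /etc/hadoop/conf from a yarn-hdfs-master unit onto
#    spark-master/0, e.g. with juju scp)
#
# Then, on spark-master/0:
export HADOOP_CONF_DIR=/etc/hadoop/conf
export YARN_CONF_DIR="$HADOOP_CONF_DIR"
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
echo "YARN_CONF_DIR=$YARN_CONF_DIR"
```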
But I still cannot see an /etc/hadoop/conf directory on the spark-master
machine, so I still get the same error about HADOOP_CONF_DIR and
YARN_CONF_DIR (below):

root@ip-172-31-60-53:~# spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 1g --executor-memory 1g --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
        at org.apache.spark.deploy.SparkSubmitArguments.checkRequiredArguments(SparkSubmitArguments.scala:177)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:81)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
root@ip-172-31-60-53:~#

Should there be an /etc/hadoop/conf directory?

Thanks for any help,

Ken

On 30 January 2015 at 12:59, Samuel Cozannet <samuel.cozan...@canonical.com> wrote:

> Have you tried without ':master':
>
>     juju add-relation yarn-hdfs-master:resourcemanager spark-master
>
> I think Spark master consumes the relationship but doesn't have to expose
> its master relationship.
>
> Rule of thumb: when a relation is unambiguous on one of its ends, there
> is no requirement to specify it when adding it.
>
> Another option, if this doesn't work, is to use the GUI to create the
> relation. It will give you a dropdown of available relationships between
> entities.
> Let me know how it goes,
> Thx,
> Sam
>
> Best,
> Samuel
>
> --
> Samuel Cozannet
> Cloud, Big Data and IoT Strategy Team
> Business Development - Cloud and ISV Ecosystem
> Changing the Future of Cloud
> Ubuntu <http://ubuntu.com> / Canonical UK LTD <http://canonical.com> /
> Juju <https://jujucharms.com>
> samuel.cozan...@canonical.com
> mob: +33 616 702 389
> skype: samnco
> Twitter: @SaMnCo_23
>
> On Fri, Jan 30, 2015 at 1:09 PM, Ken Williams <ke...@theasi.co> wrote:
>
>> Hi Sam,
>>
>> I understand what you are saying, but when I try to add the two
>> relations I get this error:
>>
>> root@adminuser-VirtualBox:~# juju add-relation yarn-hdfs-master:resourcemanager spark-master:master
>> ERROR no relations found
>> root@adminuser-VirtualBox:~# juju add-relation yarn-hdfs-master:namenode spark-master:master
>> ERROR no relations found
>>
>> Am I adding the relations right?
>>
>> Attached is my 'juju status' file.
>>
>> Thanks for all your help,
>>
>> Ken
>>
>> On 30 January 2015 at 11:16, Samuel Cozannet <samuel.cozan...@canonical.com> wrote:
>>
>>> Hey Ken,
>>>
>>> Yes, you need to create the relationship between the 2 entities so they
>>> know about each other.
>>>
>>> Looking at the list of hooks for the charm
>>> <https://github.com/Archethought/spark-charm/tree/master/hooks> you can
>>> see there are 2 hooks named namenode-relation-changed
>>> <https://github.com/Archethought/spark-charm/blob/master/hooks/namenode-relation-changed>
>>> and resourcemanager-relation-changed
>>> <https://github.com/Archethought/spark-charm/blob/master/hooks/resourcemanager-relation-changed>
>>> which are related to YARN/Hadoop.
>>> Looking deeper in the code, you'll notice they reference a function
>>> found in bdutils.py called "setHadoopEnvVar()", which based on its name
>>> should set the HADOOP_CONF_DIR.
>>>
>>> There are 2 relations, so add both of them.
>>>
>>> Note that I didn't test this myself, but I expect this should fix the
>>> problem.
>>> If it doesn't, please come back to us...
>>>
>>> Thanks!
>>> Sam
>>>
>>> On Fri, Jan 30, 2015 at 11:51 AM, Ken Williams <ke...@theasi.co> wrote:
>>>
>>>> Thanks, Kapil - this works :-)
>>>>
>>>> I can now run the SparkPi example successfully:
>>>>
>>>> root@ip-172-31-60-53:~# spark-submit --class org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
>>>> Spark assembly has been built with Hive, including Datanucleus jars on classpath
>>>> 15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> Pi is roughly 3.14318
>>>> root@ip-172-31-60-53:~#
>>>>
>>>> I'm now trying to run the same example with the spark-submit '--master'
>>>> option set to either 'yarn-cluster' or 'yarn-client',
>>>> but I keep getting the same error:
>>>>
>>>> root@ip-172-31-60-53:~# spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 1g --executor-memory 1g --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
>>>> Spark assembly has been built with Hive, including Datanucleus jars on classpath
>>>> Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
>>>>
>>>> But on my spark-master/0 machine there is no /etc/hadoop/conf directory.
>>>> So what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be?
>>>> Do I need to add a juju relation between spark-master and ...
>>>> yarn-hdfs-master to make them aware of each other?
>>>>
>>>> Thanks for any help,
>>>>
>>>> Ken
>>>>
>>>> On 28 January 2015 at 19:32, Kapil Thangavelu <kapil.thangav...@canonical.com> wrote:
>>>>
>>>>> On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams <ke...@theasi.co> wrote:
>>>>>
>>>>>> Hi Sam/Amir,
>>>>>>
>>>>>> I've been able to 'juju ssh spark-master/0' and I successfully ran
>>>>>> the two simple examples for pyspark and spark-shell:
>>>>>>
>>>>>> ./bin/pyspark
>>>>>> >>> sc.parallelize(range(1000)).count()
>>>>>> 1000
>>>>>>
>>>>>> ./bin/spark-shell
>>>>>> scala> sc.parallelize(1 to 1000).count()
>>>>>> 1000
>>>>>>
>>>>>> Now I want to run some of the spark examples in the spark-examples*.jar
>>>>>> file, which I have on my local machine. How do I copy the jar file
>>>>>> from my local machine to the AWS machine?
>>>>>>
>>>>>> I have tried 'scp' and 'juju scp' from the local command-line but
>>>>>> both fail (below):
>>>>>>
>>>>>> root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar ubuntu@ip-172-31-59:/tmp
>>>>>> ssh: Could not resolve hostname ip-172-31-59: Name or service not known
>>>>>> lost connection
>>>>>> root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar ubuntu@ip-172-31-59:/tmp
>>>>>> ERROR exit status 1 (nc: getaddrinfo: Name or service not known)
>>>>>>
>>>>>> Any ideas?
>>>>>
>>>>> juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
>>>>>
>>>>>> Ken
>>>>>>
>>>>>> On 28 January 2015 at 17:29, Samuel Cozannet <samuel.cozan...@canonical.com> wrote:
>>>>>>
>>>>>>> Glad it worked!
>>>>>>>
>>>>>>> I'll make a merge request to the upstream so that it works natively
>>>>>>> from the store asap.
>>>>>>>
>>>>>>> Thanks for catching that!
>>>>>>> Samuel
>>>>>>>
>>>>>>> On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams <ke...@theasi.co> wrote:
>>>>>>>
>>>>>>>> Hi Sam (and Maarten),
>>>>>>>>
>>>>>>>> Cloning Spark 1.2.0 from github seems to have worked!
>>>>>>>> I can install the Spark examples afterwards.
>>>>>>>>
>>>>>>>> Thanks for all your help!
>>>>>>>>
>>>>>>>> Yes - Andrew and Angie both say 'hi' :-)
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>>
>>>>>>>> Ken
>>>>>>>>
>>>>>>>> On 28 January 2015 at 16:43, Samuel Cozannet <samuel.cozan...@canonical.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Ken,
>>>>>>>>>
>>>>>>>>> So I had a closer look at your Spark problem and found out what
>>>>>>>>> went wrong.
>>>>>>>>>
>>>>>>>>> The charm available on the charmstore is trying to download Spark
>>>>>>>>> 1.0.2, and the versions available on the Apache website are 1.1.0,
>>>>>>>>> 1.1.1 and 1.2.0.
>>>>>>>>>
>>>>>>>>> There is another version of the charm available on GitHub that
>>>>>>>>> will actually deploy 1.2.0:
>>>>>>>>>
>>>>>>>>> 1. On your computer, create the below folders & get there:
>>>>>>>>>
>>>>>>>>> cd ~
>>>>>>>>> mkdir charms
>>>>>>>>> mkdir charms/trusty
>>>>>>>>> cd charms/trusty
>>>>>>>>>
>>>>>>>>> 2. Branch the Spark charm:
>>>>>>>>>
>>>>>>>>> git clone https://github.com/Archethought/spark-charm spark
>>>>>>>>>
>>>>>>>>> 3.
Deploy Spark from the local repository:
>>>>>>>>>
>>>>>>>>> juju deploy --repository=~/charms local:trusty/spark spark-master
>>>>>>>>> juju deploy --repository=~/charms local:trusty/spark spark-slave
>>>>>>>>> juju add-relation spark-master:master spark-slave:slave
>>>>>>>>>
>>>>>>>>> Worked on AWS for me just minutes ago. Let me know how it goes for
>>>>>>>>> you. Note that this version of the charm does NOT install the Spark
>>>>>>>>> examples. The files are present though, so you'll find them in
>>>>>>>>> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>>>>>>>>>
>>>>>>>>> Hope that helps...
>>>>>>>>> Let me know if it works for you!
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Sam
>>>>>>>>>
>>>>>>>>> On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams <ke...@theasi.co> wrote:
>>>>>>>>>
>>>>>>>>>> Hi folks,
>>>>>>>>>>
>>>>>>>>>> I'm completely new to juju, so any help is appreciated.
>>>>>>>>>>
>>>>>>>>>> I'm trying to create a hadoop/analytics-type platform.
>>>>>>>>>>
>>>>>>>>>> I've managed to install the 'data-analytics-with-sql-like' bundle
>>>>>>>>>> (using this command):
>>>>>>>>>>
>>>>>>>>>> juju quickstart bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>>>>>>>>>>
>>>>>>>>>> This is very impressive, and gives me virtually everything that I
>>>>>>>>>> want (hadoop, hive, etc) - but I also need Spark.
>>>>>>>>>>
>>>>>>>>>> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
>>>>>>>>>> and bundle (http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
>>>>>>>>>> however do not seem stable or available, and I can't figure out
>>>>>>>>>> how to install them.
>>>>>>>>>>
>>>>>>>>>> Should I just download and install the Spark tar-ball on the nodes
>>>>>>>>>> in my AWS cluster, or is there a better way to do this?
>>>>>>>>>>
>>>>>>>>>> Thanks in advance,
>>>>>>>>>>
>>>>>>>>>> Ken
--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju