Thanks, Kapil - this works :-)  I can now run the SparkPi example
successfully.

root@ip-172-31-60-53:~# spark-submit --class org.apache.spark.examples.SparkPi /tmp/spark-examples-1.2.0-hadoop2.4.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/01/30 10:29:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Pi is roughly 3.14318
root@ip-172-31-60-53:~#

I'm now trying to run the same example with the spark-submit '--master'
option set to either 'yarn-cluster' or 'yarn-client', but I keep getting
the same error:

root@ip-172-31-60-53:~# spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 1g --executor-memory 1g --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.

But on my spark-master/0 machine there is no /etc/hadoop/conf directory,
so what should the HADOOP_CONF_DIR or YARN_CONF_DIR value be?
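(For what it's worth, on a box with a regular Hadoop install I would have
guessed they should point at the directory holding core-site.xml and
yarn-site.xml, something like:

export HADOOP_CONF_DIR=/etc/hadoop/conf
export YARN_CONF_DIR=$HADOOP_CONF_DIR

but that's just an assumption on my part - that directory doesn't exist
on spark-master/0, so I don't know what the equivalent location is under
this charm.)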
Do I need to add a juju relation between spark-master and ...
yarn-hdfs-master, to make them aware of each other?
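(Again, purely a guess on my part - I don't know the actual relation
names these charms expose - but I was imagining something like:

juju add-relation spark-master yarn-hdfs-master

assuming the two charms provide compatible interfaces.)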
Thanks for any help,

Ken


On 28 January 2015 at 19:32, Kapil Thangavelu <
kapil.thangav...@canonical.com> wrote:

>
> On Wed, Jan 28, 2015 at 1:54 PM, Ken Williams <ke...@theasi.co> wrote:
>
>> Hi Sam/Amir,
>>
>> I've been able to 'juju ssh spark-master/0' and I successfully ran the
>> two simple examples for pyspark and spark-shell,
>>
>> ./bin/pyspark
>> >>> sc.parallelize(range(1000)).count()
>> 1000
>>
>> ./bin/spark-shell
>> scala> sc.parallelize(1 to 1000).count()
>> 1000
>>
>> Now I want to run some of the spark examples in the spark-examples*.jar
>> file, which I have on my local machine. How do I copy the jar file from
>> my local machine to the AWS machine?
>>
>> I have tried 'scp' and 'juju scp' from the local command-line but both
>> fail (below),
>>
>> root@adminuser:~# scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar ubuntu@ip-172-31-59:/tmp
>> ssh: Could not resolve hostname ip-172-31-59: Name or service not known
>> lost connection
>> root@adminuser:~# juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar ubuntu@ip-172-31-59:/tmp
>> ERROR exit status 1 (nc: getaddrinfo: Name or service not known)
>>
>> Any ideas?
>>
>
> juju scp /tmp/spark-examples-1.2.0-hadoop2.4.0.jar spark-master/0:/tmp
>
>>
>> Ken
>>
>> On 28 January 2015 at 17:29, Samuel Cozannet <
>> samuel.cozan...@canonical.com> wrote:
>>
>>> Glad it worked!
>>>
>>> I'll make a merge request to the upstream so that it works natively
>>> from the store asap.
>>>
>>> Thanks for catching that!
>>>
>>> Best,
>>> Samuel
>>>
>>> --
>>> Samuel Cozannet
>>> Cloud, Big Data and IoT Strategy Team
>>> Business Development - Cloud and ISV Ecosystem
>>> Changing the Future of Cloud
>>> Ubuntu <http://ubuntu.com> / Canonical UK LTD <http://canonical.com> /
>>> Juju <https://jujucharms.com>
>>> samuel.cozan...@canonical.com
>>> mob: +33 616 702 389
>>> skype: samnco
>>> Twitter: @SaMnCo_23
>>>
>>> On Wed, Jan 28, 2015 at 6:15 PM, Ken Williams <ke...@theasi.co> wrote:
>>>
>>>> Hi Sam (and Maarten),
>>>>
>>>> Cloning Spark 1.2.0 from github seems to have worked!
>>>> I can install the Spark examples afterwards.
>>>>
>>>> Thanks for all your help!
>>>>
>>>> Yes - Andrew and Angie both say 'hi' :-)
>>>>
>>>> Best Regards,
>>>>
>>>> Ken
>>>>
>>>> On 28 January 2015 at 16:43, Samuel Cozannet <
>>>> samuel.cozan...@canonical.com> wrote:
>>>>
>>>>> Hey Ken,
>>>>>
>>>>> So I had a closer look at your Spark problem and found out what went
>>>>> wrong.
>>>>>
>>>>> The charm available on the charmstore is trying to download Spark
>>>>> 1.0.2, and the versions available on the Apache website are 1.1.0,
>>>>> 1.1.1 and 1.2.0.
>>>>>
>>>>> There is another version of the charm available on GitHub that will
>>>>> actually deploy 1.2.0.
>>>>>
>>>>> 1. On your computer, create the folders below and cd into them:
>>>>>
>>>>> cd ~
>>>>> mkdir charms
>>>>> mkdir charms/trusty
>>>>> cd charms/trusty
>>>>>
>>>>> 2. Branch the Spark charm:
>>>>>
>>>>> git clone https://github.com/Archethought/spark-charm spark
>>>>>
>>>>> 3. Deploy Spark from the local repository:
>>>>>
>>>>> juju deploy --repository=~/charms local:trusty/spark spark-master
>>>>> juju deploy --repository=~/charms local:trusty/spark spark-slave
>>>>> juju add-relation spark-master:master spark-slave:slave
>>>>>
>>>>> Worked on AWS for me just minutes ago. Let me know how it goes for
>>>>> you. Note that this version of the charm does NOT install the Spark
>>>>> examples. The files are present though, so you'll find them in
>>>>> /var/lib/juju/agents/unit-spark-master-0/charm/files/archive
>>>>>
>>>>> Hope that helps...
>>>>> Let me know if it works for you!
>>>>>
>>>>> Best,
>>>>> Sam
>>>>>
>>>>> --
>>>>> Samuel Cozannet
>>>>> Cloud, Big Data and IoT Strategy Team
>>>>> Business Development - Cloud and ISV Ecosystem
>>>>> Changing the Future of Cloud
>>>>> Ubuntu <http://ubuntu.com> / Canonical UK LTD <http://canonical.com> /
>>>>> Juju <https://jujucharms.com>
>>>>> samuel.cozan...@canonical.com
>>>>> mob: +33 616 702 389
>>>>> skype: samnco
>>>>> Twitter: @SaMnCo_23
>>>>>
>>>>> On Wed, Jan 28, 2015 at 4:44 PM, Ken Williams <ke...@theasi.co> wrote:
>>>>>
>>>>>> Hi folks,
>>>>>>
>>>>>> I'm completely new to juju so any help is appreciated.
>>>>>>
>>>>>> I'm trying to create a hadoop/analytics-type platform.
>>>>>>
>>>>>> I've managed to install the 'data-analytics-with-sql-like' bundle
>>>>>> (using this command):
>>>>>>
>>>>>> juju quickstart bundle:data-analytics-with-sql-like/data-analytics-with-sql-like
>>>>>>
>>>>>> This is very impressive, and gives me virtually everything that I
>>>>>> want (hadoop, hive, etc) - but I also need Spark.
>>>>>>
>>>>>> The Spark charm (http://manage.jujucharms.com/~asanjar/trusty/spark)
>>>>>> and bundle (http://manage.jujucharms.com/bundle/~asanjar/spark/spark-cluster)
>>>>>> however do not seem stable or available, and I can't figure out how
>>>>>> to install them.
>>>>>>
>>>>>> Should I just download and install the Spark tar-ball on the nodes
>>>>>> in my AWS cluster, or is there a better way to do this?
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Ken
--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju