You can check a script that I created for the Amazon cloud:
https://snippetessay.wordpress.com/2015/04/18/big-data-lab-in-the-cloud-with-hadoopsparkrpython/

If I remember correctly, you need to add something to a startup .py file for 
IPython so that the SparkContext (sc) gets created when the notebook starts.
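
As a rough sketch of what such a startup file might look like (this is not the script from the link above, just an illustration): IPython runs any .py file placed in a profile's startup directory, e.g. ~/.ipython/profile_default/startup/. The file name 00-pyspark-setup.py, the helper names spark_python_paths and create_spark_context, the app name "notebook", and the /root/spark fallback are all assumptions made for this example. The py4j zip name varies by Spark release, so it is matched with a glob rather than hard-coded.

```python
# Hypothetical file: ~/.ipython/profile_default/startup/00-pyspark-setup.py
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries that must be on sys.path before pyspark can be imported."""
    python_dir = os.path.join(spark_home, "python")
    # The bundled py4j zip carries a version number, so look it up with a glob.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def create_spark_context(spark_home, master="local[2]", app_name="notebook"):
    """Put Spark's Python libraries on sys.path, then build a SparkContext."""
    for path in spark_python_paths(spark_home):
        if path not in sys.path:
            sys.path.insert(0, path)
    # Imported late on purpose: pyspark is only importable once sys.path is set.
    from pyspark import SparkContext
    return SparkContext(master=master, appName=app_name)


# In the real startup file, uncomment the next line so that every notebook
# starts with `sc` defined. "/root/spark" is an assumed fallback location.
# sc = create_spark_context(os.environ.get("SPARK_HOME", "/root/spark"))
```

With that line uncommented, a cell such as sc.textFile("readme.txt").take(1) should find sc already defined. Adjust the master argument if you want the notebook to use the cluster rather than local mode.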
> On 03 Nov 2015, at 01:04, Andy Davidson <a...@santacruzintegration.com> wrote:
> 
> Hi
> 
> I recently installed a new cluster using the 
> spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2. SparkPi sample app works correctly. 
> 
> I am trying to run iPython notebook on my cluster master and use an ssh 
> tunnel so that I can work with the notebook in a browser running on my mac. 
> Below is how I set up the ssh tunnel:
> 
>       $ ssh -i $KEY_FILE -N -f -L localhost:8888:localhost:7000 ec2-user@$SPARK_MASTER
> 
>       $ ssh -i $KEY_FILE ec2-user@$SPARK_MASTER
>       $ cd top level notebook dir
>       $ IPYTHON_OPTS="notebook --no-browser --port=7000" /root/spark/bin/pyspark
> 
> I am able to access my notebooks in the browser by opening 
> http://localhost:8888
> 
> When I run the following Python code I get the error "NameError: name 'sc' 
> is not defined". Any idea what the problem might be? 
> 
> I looked through pyspark and tried various combinations of the following but 
> still get the same error
> 
> $ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7000" /root/spark/bin/pyspark --master=local[2]
> 
> Kind regards
> 
> Andy
> 
> In [1]:
> 
> import sys
> print (sys.version)
>  
> import os
> print(os.getcwd() + "\n")
> 2.6.9 (unknown, Apr  1 2015, 18:16:00) 
> [GCC 4.8.2 20140120 (Red Hat 4.8.2-16)]
> /home/ec2-user/dataScience
> 
> In [2]:
> 
> from pyspark import SparkContext
> textFile = sc.textFile("readme.txt")
> textFile.take(1)
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> <ipython-input-2-b67a9be29bd9> in <module>()
>       1 from pyspark import SparkContext
> ----> 2 textFile = sc.textFile("readme.txt")
>       3 textFile.take(1)
> 
> NameError: name 'sc' is not defined
> 
> In [ ]:
> 
>  
