You forgot to create a SparkContext instance: sc = SparkContext()

The pyspark shell gets `sc` from /root/spark/python/pyspark/shell.py (note the PYTHONSTARTUP line in your trace), but the IPython notebook kernel does not run PYTHONSTARTUP, so you have to create the context yourself in the first cell.
On Tue, Nov 3, 2015 at 9:59 AM, Andy Davidson <a...@santacruzintegration.com> wrote:
> I am having a heck of a time getting IPython notebooks to work on my 1.5.1
> AWS cluster I created using spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2.
>
> I have read the instructions for using IPython notebook on
> http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell
>
> I want to run the notebook server on my master and use an ssh tunnel to
> connect from a web browser running on my Mac.
>
> I am confident the cluster is set up correctly because the SparkPi example
> runs.
>
> I am able to use IPython notebooks on my local Mac and work with Spark and
> local files without any problems.
>
> I know the ssh tunnel is working.
>
> On my cluster I am able to use the Python shell in general:
>
> [ec2-user@ip-172-31-29-60 dataScience]$ /root/spark/bin/pyspark --master local[2]
>
> >>> from pyspark import SparkContext
> >>> textFile = sc.textFile("file:///home/ec2-user/dataScience/readme.txt")
> >>> textFile.take(1)
>
> When I run the exact same code in an IPython notebook I get:
>
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> <ipython-input-1-ba11b935529e> in <module>()
>      11 from pyspark import SparkContext, SparkConf
>      12
> ---> 13 textFile = sc.textFile("file:///home/ec2-user/dataScience/readme.txt")
>      14
>      15 textFile.take(1)
>
> NameError: name 'sc' is not defined
>
> To try and debug I wrote a script to launch pyspark and added 'set -x' to
> pyspark so I could see what the script was doing.
>
> Any idea how I can debug this?
> Thanks in advance
>
> Andy
>
> $ cat notebook.sh
> set -x
> export PYSPARK_DRIVER_PYTHON=ipython
> export PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7000"
> /root/spark/bin/pyspark --master local[2]
>
> [ec2-user@ip-172-31-29-60 dataScience]$ ./notebook.sh
> ++ export PYSPARK_DRIVER_PYTHON=ipython
> ++ PYSPARK_DRIVER_PYTHON=ipython
> ++ export 'PYSPARK_DRIVER_PYTHON_OPTS=notebook --no-browser --port=7000'
> ++ PYSPARK_DRIVER_PYTHON_OPTS='notebook --no-browser --port=7000'
> ++ /root/spark/bin/pyspark --master 'local[2]'
> +++ dirname /root/spark/bin/pyspark
> ++ cd /root/spark/bin/..
> ++ pwd
> + export SPARK_HOME=/root/spark
> + SPARK_HOME=/root/spark
> + source /root/spark/bin/load-spark-env.sh
> ++++ dirname /root/spark/bin/pyspark
> +++ cd /root/spark/bin/..
> +++ pwd
> ++ FWDIR=/root/spark
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++++ dirname /root/spark/bin/pyspark
> +++ cd /root/spark/bin/..
> +++ pwd
> ++ parent_dir=/root/spark
> ++ user_conf_dir=/root/spark/conf
> ++ '[' -f /root/spark/conf/spark-env.sh ']'
> ++ set -a
> ++ . /root/spark/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/java/latest
> +++ JAVA_HOME=/usr/java/latest
> +++ export SPARK_LOCAL_DIRS=/mnt/spark,/mnt2/spark
> +++ SPARK_LOCAL_DIRS=/mnt/spark,/mnt2/spark
> +++ export SPARK_MASTER_OPTS=
> +++ SPARK_MASTER_OPTS=
> +++ '[' -n 1 ']'
> +++ export SPARK_WORKER_INSTANCES=1
> +++ SPARK_WORKER_INSTANCES=1
> +++ export SPARK_WORKER_CORES=2
> +++ SPARK_WORKER_CORES=2
> +++ export HADOOP_HOME=/root/ephemeral-hdfs
> +++ HADOOP_HOME=/root/ephemeral-hdfs
> +++ export SPARK_MASTER_IP=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
> +++ SPARK_MASTER_IP=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
> ++++ cat /root/spark-ec2/cluster-url
> +++ export MASTER=spark://ec2-54-215-207-132.us-west-1.compute.amazonaws.com:7077
> +++ MASTER=spark://ec2-54-215-207-132.us-west-1.compute.amazonaws.com:7077
> +++ export SPARK_SUBMIT_LIBRARY_PATH=:/root/ephemeral-hdfs/lib/native/
> +++ SPARK_SUBMIT_LIBRARY_PATH=:/root/ephemeral-hdfs/lib/native/
> +++ export SPARK_SUBMIT_CLASSPATH=::/root/ephemeral-hdfs/conf
> +++ SPARK_SUBMIT_CLASSPATH=::/root/ephemeral-hdfs/conf
> ++++ wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname
> +++ export SPARK_PUBLIC_DNS=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
> +++ SPARK_PUBLIC_DNS=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
> +++ export YARN_CONF_DIR=/root/ephemeral-hdfs/conf
> +++ YARN_CONF_DIR=/root/ephemeral-hdfs/conf
> ++++ id -u
> +++ '[' 222 == 0 ']'
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/root/spark/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/root/spark/assembly/target/scala-2.10
> ++ [[ -d /root/spark/assembly/target/scala-2.11 ]]
> ++ '[' -d /root/spark/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + export '_SPARK_CMD_USAGE=Usage: ./bin/pyspark [options]'
> + _SPARK_CMD_USAGE='Usage: ./bin/pyspark [options]'
> + hash python2.7
> + DEFAULT_PYTHON=python2.7
> + [[ -n '' ]]
> + [[ '' == \1 ]]
> + [[ -z ipython ]]
> + [[ -z '' ]]
> + [[ ipython == *ipython* ]]
> + [[ python2.7 != \p\y\t\h\o\n\2\.\7 ]]
> + PYSPARK_PYTHON=python2.7
> + export PYSPARK_PYTHON
> + export PYTHONPATH=/root/spark/python/:
> + PYTHONPATH=/root/spark/python/:
> + export PYTHONPATH=/root/spark/python/lib/py4j-0.8.2.1-src.zip:/root/spark/python/:
> + PYTHONPATH=/root/spark/python/lib/py4j-0.8.2.1-src.zip:/root/spark/python/:
> + export OLD_PYTHONSTARTUP=
> + OLD_PYTHONSTARTUP=
> + export PYTHONSTARTUP=/root/spark/python/pyspark/shell.py
> + PYTHONSTARTUP=/root/spark/python/pyspark/shell.py
> + [[ -n '' ]]
> + export PYSPARK_DRIVER_PYTHON
> + export PYSPARK_DRIVER_PYTHON_OPTS
> + exec /root/spark/bin/spark-submit pyspark-shell-main --name PySparkShell --master 'local[2]'
> [NotebookApp] Using existing profile dir: u'/home/ec2-user/.ipython/profile_default'
> [NotebookApp] Serving notebooks from /home/ec2-user/dataScience
> [NotebookApp] The IPython Notebook is running at: http://127.0.0.1:7000/
> [NotebookApp] Use Control-C to stop this server and shut down all kernels.
> [NotebookApp] Using MathJax from CDN: http://cdn.mathjax.org/mathjax/latest/MathJax.js
> [NotebookApp] Kernel started: 2ba6864b-0dc8-4814-8e05-4f532cb40e2b
> [NotebookApp] Connecting to: tcp://127.0.0.1:55099
> [NotebookApp] Connecting to: tcp://127.0.0.1:48994
> [NotebookApp] Connecting to: tcp://127.0.0.1:57214
> [IPKernelApp] To connect another client to this kernel, use:
> [IPKernelApp] --existing kernel-2ba6864b-0dc8-4814-8e05-4f532cb40e2b.json

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org