You forgot to create a SparkContext instance. bin/pyspark normally creates
sc for you via PYTHONSTARTUP (pyspark/shell.py -- you can see it being
exported in your trace below), but the IPython notebook kernel does not
appear to run that startup file, so sc is never defined. Create it yourself
in the first cell:

from pyspark import SparkContext

sc = SparkContext()
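
If you want to set the app name and master explicitly instead of relying on
the MASTER variable from spark-env.sh, a minimal first cell could look like
the sketch below (the app name and the local[2] master are placeholders;
point setMaster at your cluster URL if you want the real master):

from pyspark import SparkConf, SparkContext

# Only one SparkContext may exist per JVM, so guard against
# re-running this cell in the same kernel.
conf = SparkConf().setAppName("notebook").setMaster("local[2]")
try:
    sc
except NameError:
    sc = SparkContext(conf=conf)

textFile = sc.textFile("file:///home/ec2-user/dataScience/readme.txt")
print textFile.take(1)

Alternatively (untested, just going by the PYTHONSTARTUP line in your
trace), the first cell can run the same bootstrap file the shell uses,
which creates sc for you:

# executes the same file bin/pyspark points PYTHONSTARTUP at
execfile('/root/spark/python/pyspark/shell.py')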

On Tue, Nov 3, 2015 at 9:59 AM, Andy Davidson
<a...@santacruzintegration.com> wrote:
> I am having a heck of a time getting IPython notebooks to work on my 1.5.1
> AWS cluster, which I created using spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2.
>
> I have read the instructions for using IPython notebook on
> http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell
>
> I want to run the notebook server on my master and use an ssh tunnel to
> connect from a web browser running on my Mac.
>
> I am confident the cluster is set up correctly because the SparkPi example
> runs.
>
> I am able to use IPython notebooks on my local Mac and work with Spark and
> local files without any problems.
>
> I know the ssh tunnel is working.
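>
> For reference, the tunnel is just a local port forward along these lines
> (the key path and master hostname are placeholders):
>
> ssh -i ~/.ssh/mykey.pem -N -L 7000:localhost:7000 ec2-user@<master-public-dns>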
>
> On my cluster I am able to use the Python shell in general:
>
> [ec2-user@ip-172-31-29-60 dataScience]$ /root/spark/bin/pyspark --master local[2]
>
>
>>>> from pyspark import SparkContext
>
>>>> textFile = sc.textFile("file:///home/ec2-user/dataScience/readme.txt")
>
>>>> textFile.take(1)
>
>
>
> When I run the exact same code in an IPython notebook I get:
>
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> <ipython-input-1-ba11b935529e> in <module>()
>      11 from pyspark import SparkContext, SparkConf
>      12
> ---> 13 textFile = sc.textFile("file:///home/ec2-user/dataScience/readme.txt")
>      14
>      15 textFile.take(1)
>
> NameError: name 'sc' is not defined
>
>
>
>
> To try and debug, I wrote a script to launch pyspark and added 'set -x' to
> the pyspark script so I could see what it was doing.
>
> Any idea how I can debug this?
>
> Thanks in advance
>
> Andy
>
> $ cat notebook.sh
>
> set -x
>
> export PYSPARK_DRIVER_PYTHON=ipython
>
> export PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7000"
>
> /root/spark/bin/pyspark --master local[2]
>
>
>
>
> [ec2-user@ip-172-31-29-60 dataScience]$ ./notebook.sh
>
> ++ export PYSPARK_DRIVER_PYTHON=ipython
>
> ++ PYSPARK_DRIVER_PYTHON=ipython
>
> ++ export 'PYSPARK_DRIVER_PYTHON_OPTS=notebook --no-browser --port=7000'
>
> ++ PYSPARK_DRIVER_PYTHON_OPTS='notebook --no-browser --port=7000'
>
> ++ /root/spark/bin/pyspark --master 'local[2]'
>
> +++ dirname /root/spark/bin/pyspark
>
> ++ cd /root/spark/bin/..
>
> ++ pwd
>
> + export SPARK_HOME=/root/spark
>
> + SPARK_HOME=/root/spark
>
> + source /root/spark/bin/load-spark-env.sh
>
> ++++ dirname /root/spark/bin/pyspark
>
> +++ cd /root/spark/bin/..
>
> +++ pwd
>
> ++ FWDIR=/root/spark
>
> ++ '[' -z '' ']'
>
> ++ export SPARK_ENV_LOADED=1
>
> ++ SPARK_ENV_LOADED=1
>
> ++++ dirname /root/spark/bin/pyspark
>
> +++ cd /root/spark/bin/..
>
> +++ pwd
>
> ++ parent_dir=/root/spark
>
> ++ user_conf_dir=/root/spark/conf
>
> ++ '[' -f /root/spark/conf/spark-env.sh ']'
>
> ++ set -a
>
> ++ . /root/spark/conf/spark-env.sh
>
> +++ export JAVA_HOME=/usr/java/latest
>
> +++ JAVA_HOME=/usr/java/latest
>
> +++ export SPARK_LOCAL_DIRS=/mnt/spark,/mnt2/spark
>
> +++ SPARK_LOCAL_DIRS=/mnt/spark,/mnt2/spark
>
> +++ export SPARK_MASTER_OPTS=
>
> +++ SPARK_MASTER_OPTS=
>
> +++ '[' -n 1 ']'
>
> +++ export SPARK_WORKER_INSTANCES=1
>
> +++ SPARK_WORKER_INSTANCES=1
>
> +++ export SPARK_WORKER_CORES=2
>
> +++ SPARK_WORKER_CORES=2
>
> +++ export HADOOP_HOME=/root/ephemeral-hdfs
>
> +++ HADOOP_HOME=/root/ephemeral-hdfs
>
> +++ export
> SPARK_MASTER_IP=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
>
> +++ SPARK_MASTER_IP=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
>
> ++++ cat /root/spark-ec2/cluster-url
>
> +++ export
> MASTER=spark://ec2-54-215-207-132.us-west-1.compute.amazonaws.com:7077
>
> +++ MASTER=spark://ec2-54-215-207-132.us-west-1.compute.amazonaws.com:7077
>
> +++ export SPARK_SUBMIT_LIBRARY_PATH=:/root/ephemeral-hdfs/lib/native/
>
> +++ SPARK_SUBMIT_LIBRARY_PATH=:/root/ephemeral-hdfs/lib/native/
>
> +++ export SPARK_SUBMIT_CLASSPATH=::/root/ephemeral-hdfs/conf
>
> +++ SPARK_SUBMIT_CLASSPATH=::/root/ephemeral-hdfs/conf
>
> ++++ wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname
>
> +++ export
> SPARK_PUBLIC_DNS=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
>
> +++ SPARK_PUBLIC_DNS=ec2-54-215-207-132.us-west-1.compute.amazonaws.com
>
> +++ export YARN_CONF_DIR=/root/ephemeral-hdfs/conf
>
> +++ YARN_CONF_DIR=/root/ephemeral-hdfs/conf
>
> ++++ id -u
>
> +++ '[' 222 == 0 ']'
>
> ++ set +a
>
> ++ '[' -z '' ']'
>
> ++ ASSEMBLY_DIR2=/root/spark/assembly/target/scala-2.11
>
> ++ ASSEMBLY_DIR1=/root/spark/assembly/target/scala-2.10
>
> ++ [[ -d /root/spark/assembly/target/scala-2.11 ]]
>
> ++ '[' -d /root/spark/assembly/target/scala-2.11 ']'
>
> ++ export SPARK_SCALA_VERSION=2.10
>
> ++ SPARK_SCALA_VERSION=2.10
>
> + export '_SPARK_CMD_USAGE=Usage: ./bin/pyspark [options]'
>
> + _SPARK_CMD_USAGE='Usage: ./bin/pyspark [options]'
>
> + hash python2.7
>
> + DEFAULT_PYTHON=python2.7
>
> + [[ -n '' ]]
>
> + [[ '' == \1 ]]
>
> + [[ -z ipython ]]
>
> + [[ -z '' ]]
>
> + [[ ipython == *ipython* ]]
>
> + [[ python2.7 != \p\y\t\h\o\n\2\.\7 ]]
>
> + PYSPARK_PYTHON=python2.7
>
> + export PYSPARK_PYTHON
>
> + export PYTHONPATH=/root/spark/python/:
>
> + PYTHONPATH=/root/spark/python/:
>
> + export
> PYTHONPATH=/root/spark/python/lib/py4j-0.8.2.1-src.zip:/root/spark/python/:
>
> +
> PYTHONPATH=/root/spark/python/lib/py4j-0.8.2.1-src.zip:/root/spark/python/:
>
> + export OLD_PYTHONSTARTUP=
>
> + OLD_PYTHONSTARTUP=
>
> + export PYTHONSTARTUP=/root/spark/python/pyspark/shell.py
>
> + PYTHONSTARTUP=/root/spark/python/pyspark/shell.py
>
> + [[ -n '' ]]
>
> + export PYSPARK_DRIVER_PYTHON
>
> + export PYSPARK_DRIVER_PYTHON_OPTS
>
> + exec /root/spark/bin/spark-submit pyspark-shell-main --name PySparkShell
> --master 'local[2]'
>
> [NotebookApp] Using existing profile dir:
> u'/home/ec2-user/.ipython/profile_default'
>
> [NotebookApp] Serving notebooks from /home/ec2-user/dataScience
>
> [NotebookApp] The IPython Notebook is running at: http://127.0.0.1:7000/
>
> [NotebookApp] Use Control-C to stop this server and shut down all kernels.
>
> [NotebookApp] Using MathJax from CDN:
> http://cdn.mathjax.org/mathjax/latest/MathJax.js
>
> [NotebookApp] Kernel started: 2ba6864b-0dc8-4814-8e05-4f532cb40e2b
>
> [NotebookApp] Connecting to: tcp://127.0.0.1:55099
>
> [NotebookApp] Connecting to: tcp://127.0.0.1:48994
>
> [NotebookApp] Connecting to: tcp://127.0.0.1:57214
>
> [IPKernelApp] To connect another client to this kernel, use:
>
> [IPKernelApp] --existing kernel-2ba6864b-0dc8-4814-8e05-4f532cb40e2b.json
>
>
