Hi Russell,

I use Jupyter Python notebooks a lot. Here is how I start the server:

set -x  # turn debugging on
#set +x # turn debugging off

# https://github.com/databricks/spark-csv
# http://spark-packages.org/package/datastax/spark-cassandra-connector
# https://github.com/datastax/spark-cassandra-connector/blob/master/doc/15_python.md
# https://github.com/datastax/spark-cassandra-connector/blob/master/doc/15_python.md#pyspark-with-data-frames

# packages are ','-separated with no whitespace
# (note: the _2.10/_2.11 Scala suffix has to match your Spark build)
extraPkgs="--packages com.databricks:spark-csv_2.11:1.3.0,datastax:spark-cassandra-connector:1.6.0-M1-s_2.10"

# use Python 3 on both the driver and the workers
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=python3

# IPYTHON_OPTS=notebook launches pyspark inside a notebook server;
# $* passes any extra command-line arguments through to pyspark
IPYTHON_OPTS=notebook $SPARK_ROOT/bin/pyspark $extraPkgs \
    --conf spark.cassandra.connection.host=ec2-54-153-102-232.us-west-1.compute.amazonaws.com $*
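
Once the notebook is up, both packages are available through the SQLContext that pyspark creates. A rough sketch of what a first cell might look like, following the two docs linked above (the CSV path and the Cassandra keyspace/table names are just placeholders, not anything from my setup):

# load a CSV into a DataFrame via the spark-csv package
df = sqlContext.read.format('com.databricks.spark.csv') \
    .options(header='true', inferSchema='true') \
    .load('/tmp/example.csv')  # placeholder path

# write the DataFrame out to Cassandra via the connector
df.write.format('org.apache.spark.sql.cassandra') \
    .options(keyspace='test', table='kv') \
    .save(mode='append')  # 'test'/'kv' are placeholder names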



From:  Russell Jurney <russell.jur...@gmail.com>
Date:  Sunday, March 27, 2016 at 7:22 PM
To:  "user @spark" <user@spark.apache.org>
Subject:  --packages configuration equivalent item name?

> I run PySpark with CSV support like so: IPYTHON=1 pyspark --packages
> com.databricks:spark-csv_2.10:1.4.0
> 
> I don't want to type this --packages argument each time. Is there a config
> item for --packages? I can't find one in the reference at
> http://spark.apache.org/docs/latest/configuration.html
> 
> If there is no way to do this, please let me know so I can make a JIRA for
> this feature.
> 
> Thanks!
> -- 
> Russell Jurney twitter.com/rjurney <http://twitter.com/rjurney>
> russell.jur...@gmail.com relato.io <http://relato.io>
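
On the config question itself: if I remember right, --packages maps to the spark.jars.packages property, so you should be able to set it once in conf/spark-defaults.conf instead of typing it every time. I have not tested this myself, so treat it as a sketch:

# conf/spark-defaults.conf (untested; should be the equivalent of --packages)
spark.jars.packages  com.databricks:spark-csv_2.10:1.4.0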

