I found the problem. I was manually constructing the CLASSPATH and SPARK_CLASSPATH because I needed extra jars to run the Cassandra library.
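[Editor's note: the root cause described below (a jar shipping its own log4j.properties and shadowing the one in SPARK_HOME/conf) can be confirmed by scanning the classpath entries. This is a minimal sketch; the helper name `jars_bundling` is hypothetical, not part of Spark. Another standard diagnostic is launching the JVM with `-Dlog4j.debug=true`, which makes log4j print which configuration file it loaded.]

```python
import zipfile

def jars_bundling(path_entries, resource="log4j.properties"):
    """Return the classpath jar entries that bundle the given resource.

    A jar that ships its own log4j.properties can shadow the one in
    SPARK_HOME/conf, depending on classpath ordering.
    """
    offenders = []
    for entry in path_entries:
        if not entry.endswith(".jar"):
            continue
        try:
            with zipfile.ZipFile(entry) as jar:
                # Jars are plain zip archives; look for the resource anywhere inside.
                if any(name.endswith(resource) for name in jar.namelist()):
                    offenders.append(entry)
        except (IOError, zipfile.BadZipfile):
            continue  # unreadable or not actually a zip archive; skip it
    return offenders
```

Called as e.g. `jars_bundling(os.environ["SPARK_CLASSPATH"].split(os.pathsep))`, it lists the jars worth removing or re-ordering.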
For some reason that I cannot explain, it was this that was causing the issue. Maybe one of the jars had a log4j.properties rolled up in it? I removed almost all of the jars from the classpath and it began to use SPARK_HOME/conf/log4j.properties.

On Wed, Oct 1, 2014 at 3:46 PM, Rick Richardson <rick.richard...@gmail.com> wrote:
> Out of curiosity, how do you actually launch pyspark in your set-up?
>
> On Wed, Oct 1, 2014 at 3:44 PM, Rick Richardson <rick.richard...@gmail.com> wrote:
>> Here is the other relevant bit of my set-up:
>>
>> MASTER=spark://sparkmaster:7077
>> IPYTHON_OPTS="notebook --pylab inline --ip=0.0.0.0"
>> CASSANDRA_NODES="cassandra1|cassandra2|cassandra3"
>> PYSPARK_SUBMIT_ARGS="--master $MASTER --deploy-mode client --num-executors 6 --executor-memory 1g --executor-cores 1" ipython notebook --profile=pyspark
>>
>> On Wed, Oct 1, 2014 at 3:41 PM, Rick Richardson <rick.richard...@gmail.com> wrote:
>>> I was starting PySpark as a profile within IPython Notebook as per:
>>> http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/
>>>
>>> My setup looks like:
>>>
>>> import os
>>> import sys
>>>
>>> spark_home = os.environ.get('SPARK_HOME', None)
>>> if not spark_home:
>>>     raise ValueError('SPARK_HOME environment variable is not set')
>>> sys.path.insert(0, os.path.join(spark_home, 'python'))
>>> sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))
>>> execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))
>>>
>>> I also have some code to expand all of the jars (and the log4j.properties) in SPARK_HOME and CASSANDRA_HOME and add them to the SPARK_CLASSPATH.
>>>
>>> I'll try your launch method and see how that goes.
>>>
>>> On Wed, Oct 1, 2014 at 3:31 PM, Davies Liu <dav...@databricks.com> wrote:
>>>> How do you set up IPython to access pyspark in the notebook?
>>>>
>>>> I did the following, and it worked for me:
>>>>
>>>> $ export SPARK_HOME=/opt/spark-1.1.0/
>>>> $ export PYTHONPATH=/opt/spark-1.1.0/python:/opt/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip
>>>> $ ipython notebook
>>>>
>>>> All the logging will go to the console (not into the notebook).
>>>>
>>>> If you want to reduce the logging in the console, you should change /opt/spark-1.1.0/conf/log4j.properties:
>>>>
>>>> log4j.rootCategory=WARN, console
>>>> log4j.logger.org.apache.spark=WARN
>>>>
>>>> On Wed, Oct 1, 2014 at 11:49 AM, Rick Richardson <rick.richard...@gmail.com> wrote:
>>>>> Thanks for your reply. Unfortunately, changing the log4j.properties within SPARK_HOME/conf has no effect on pyspark for me. When I change it in the master or workers, the log changes have the desired effect, but pyspark seems to ignore them. I have changed the levels to WARN, changed the appender to a rolling file, and removed it entirely, all with the same results.
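[Editor's note: the edit Davies describes amounts to flipping one key in conf/log4j.properties while keeping the appender list intact. A minimal sketch of applying it programmatically; the helper name `set_root_level` is hypothetical, not part of Spark or log4j.]

```python
def set_root_level(conf_text, level="WARN"):
    """Rewrite log4j.rootCategory=<LEVEL>, <appenders...> to use the given level.

    Lines other than log4j.rootCategory are passed through unchanged.
    """
    out = []
    for line in conf_text.splitlines():
        if line.startswith("log4j.rootCategory="):
            # Value is "<level>, <appender>[, <appender>...]"; keep the appenders.
            appenders = line.split("=", 1)[1].split(",")[1:]
            line = "log4j.rootCategory=" + ", ".join([level] + [a.strip() for a in appenders])
        out.append(line)
    return "\n".join(out)
```

Reading conf/log4j.properties, passing it through this, and writing it back is equivalent to the manual edit above.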
>>>>>
>>>>> On Wed, Oct 1, 2014 at 1:49 PM, Davies Liu <dav...@databricks.com> wrote:
>>>>>> On Tue, Sep 30, 2014 at 10:14 PM, Rick Richardson <rick.richard...@gmail.com> wrote:
>>>>>>> I am experiencing significant logging spam when running PySpark in IPython Notebook.
>>>>>>>
>>>>>>> Exhibit A: http://i.imgur.com/BDP0R2U.png
>>>>>>>
>>>>>>> I have taken into consideration advice from:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Disable-all-spark-logging-td1960.html
>>>>>>> and also
>>>>>>> http://stackoverflow.com/questions/25193488/how-to-turn-off-info-logging-in-pyspark
>>>>>>>
>>>>>>> I have only one log4j.properties; it is in /opt/spark-1.1.0/conf.
>>>>>>>
>>>>>>> Just before I launch IPython Notebook with a pyspark profile, I add the dir and the properties file directly to the CLASSPATH and SPARK_CLASSPATH env vars (as you can also see from the png).
>>>>>>>
>>>>>>> I still haven't been able to make any change that disables this infernal debug output.
>>>>>>>
>>>>>>> Any ideas (WAGs, solutions, commiseration) would be greatly appreciated.
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> My log4j.properties:
>>>>>>>
>>>>>>> log4j.rootCategory=INFO, console
>>>>>>> log4j.appender.console=org.apache.log4j.ConsoleAppender
>>>>>>> log4j.appender.console.layout=org.apache.log4j.PatternLayout
>>>>>>> log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>>>>>>
>>>>>> You should change log4j.rootCategory to WARN, console.
>>>>>>
>>>>>>> # Change this to set Spark log level
>>>>>>> log4j.logger.org.apache.spark=INFO
>>>>>>>
>>>>>>> # Silence akka remoting
>>>>>>> log4j.logger.Remoting=WARN
>>>>>>>
>>>>>>> # Ignore messages below warning level from Jetty, because it's a bit verbose
>>>>>>> log4j.logger.org.eclipse.jetty=WARN

--
"Science is the great antidote to the poison of enthusiasm and superstition." -- Adam Smith
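[Editor's note: when editing the properties file has no effect, as Rick reports, a common workaround is to lower the JVM-side root logger at runtime through the py4j gateway that PySpark exposes as `sc._jvm`. A sketch, assuming a live SparkContext `sc`; the helper name `quiet_spark_logging` is hypothetical.]

```python
def quiet_spark_logging(sc, level_name="WARN"):
    """Lower the log4j root logger level on the driver JVM via py4j.

    `sc` is assumed to be a live pyspark SparkContext; `sc._jvm` gives
    attribute-style access to JVM packages, here org.apache.log4j.
    """
    log4j = sc._jvm.org.apache.log4j
    level = getattr(log4j.Level, level_name)  # e.g. Level.WARN
    log4j.LogManager.getRootLogger().setLevel(level)
```

This only affects the driver JVM's logging, which is exactly the console spam shown in the notebook screenshot; executors still read their own log4j.properties.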