Out of curiosity, how do you actually launch pyspark in your set-up?

On Wed, Oct 1, 2014 at 3:44 PM, Rick Richardson <rick.richard...@gmail.com> wrote:

> Here is the other relevant bit of my set-up:
> MASTER=spark://sparkmaster:7077
> IPYTHON_OPTS="notebook --pylab inline --ip=0.0.0.0"
> CASSANDRA_NODES="cassandra1|cassandra2|cassandra3"
> PYSPARK_SUBMIT_ARGS="--master $MASTER --deploy-mode client --num-executors 6 --executor-memory 1g --executor-cores 1" ipython notebook --profile=pyspark
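>
> As a quick sanity check that those submit args actually take effect, I run
> something like this in the first notebook cell (a sketch; sc is created by
> shell.py, and sc._conf is an internal attribute, so treat it as such):
>
> print(sc.master)                              # expect spark://sparkmaster:7077
> print(sc._conf.get('spark.executor.memory'))  # expect 1g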
>
> On Wed, Oct 1, 2014 at 3:41 PM, Rick Richardson <rick.richard...@gmail.com> wrote:
>
>> I was starting PySpark as a profile within IPython Notebook as per:
>>
>> http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/
>>
>>
>> My setup looks like:
>>
>> import os
>> import sys
>>
>> # Locate the Spark installation; fail fast if it isn't configured.
>> spark_home = os.environ.get('SPARK_HOME', None)
>> if not spark_home:
>>     raise ValueError('SPARK_HOME environment variable is not set')
>>
>> # Put PySpark and its bundled py4j on sys.path, then run the standard
>> # shell bootstrap, which creates the SparkContext as `sc`.
>> sys.path.insert(0, os.path.join(spark_home, 'python'))
>> sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))
>> execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))
>>
>> I also have some code to expand all of the jars (and the log4j.properties
>> file) in SPARK_HOME and CASSANDRA_HOME and add them to SPARK_CLASSPATH;
>> a sketch of it is below.
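>>
>> A rough sketch of that expansion step (the lib/ directory layout and glob
>> pattern here are illustrative, not my exact code):
>>
>> import glob
>> import os
>>
>> jars = []
>> for var in ('SPARK_HOME', 'CASSANDRA_HOME'):
>>     home = os.environ.get(var)
>>     if home:
>>         jars.extend(glob.glob(os.path.join(home, 'lib', '*.jar')))
>>
>> # conf/ goes first so the edited log4j.properties wins over bundled copies.
>> entries = [os.path.join(os.environ['SPARK_HOME'], 'conf')] + jars
>> os.environ['SPARK_CLASSPATH'] = ':'.join(entries)
>> os.environ['CLASSPATH'] = os.environ['SPARK_CLASSPATH']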
>>
>>
>> I'll try your launch method and see how that goes.
>>
>> On Wed, Oct 1, 2014 at 3:31 PM, Davies Liu <dav...@databricks.com> wrote:
>>
>>> How do you set up IPython to access pyspark in the notebook?
>>>
>>> I did the following, and it worked for me:
>>>
>>> $ export SPARK_HOME=/opt/spark-1.1.0/
>>> $ export PYTHONPATH=/opt/spark-1.1.0/python:/opt/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip
>>> $ ipython notebook
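>>>
>>> Then, in the first cell of a notebook, create the context yourself (a
>>> minimal sketch; point the master URL at your own cluster):
>>>
>>> from pyspark import SparkConf, SparkContext
>>>
>>> conf = SparkConf().setMaster('local[*]').setAppName('notebook')
>>> sc = SparkContext(conf=conf)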
>>>
>>> All the logging will go to the console (not into the notebook).
>>>
>>> If you want to reduce the logging in the console, change
>>> /opt/spark-1.1.0/conf/log4j.properties:
>>>
>>> log4j.rootCategory=WARN, console
>>> log4j.logger.org.apache.spark=WARN
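>>>
>>> If editing the file still doesn't take, you can also raise the level at
>>> runtime through the py4j gateway (a workaround sketch; sc._jvm is an
>>> internal handle, not a public API):
>>>
>>> log4j = sc._jvm.org.apache.log4j
>>> log4j.LogManager.getRootLogger().setLevel(log4j.Level.WARN)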
>>>
>>>
>>> On Wed, Oct 1, 2014 at 11:49 AM, Rick Richardson
>>> <rick.richard...@gmail.com> wrote:
>>> > Thanks for your reply.  Unfortunately, changing the log4j.properties
>>> > within SPARK_HOME/conf has no effect on pyspark for me.  When I change
>>> > it on the master or workers, the log changes have the desired effect,
>>> > but pyspark seems to ignore them.  I have changed the levels to WARN,
>>> > changed the appender to a rolling file, and removed it entirely, all
>>> > with the same results.
>>> >
>>> > On Wed, Oct 1, 2014 at 1:49 PM, Davies Liu <dav...@databricks.com> wrote:
>>> >>
>>> >> On Tue, Sep 30, 2014 at 10:14 PM, Rick Richardson
>>> >> <rick.richard...@gmail.com> wrote:
>>> >> > I am experiencing significant logging spam when running PySpark in
>>> >> > IPython Notebook.
>>> >> >
>>> >> > Exhibit A:  http://i.imgur.com/BDP0R2U.png
>>> >> >
>>> >> > I have taken into consideration advice from:
>>> >> >
>>> >> > http://apache-spark-user-list.1001560.n3.nabble.com/Disable-all-spark-logging-td1960.html
>>> >> >
>>> >> > also
>>> >> >
>>> >> > http://stackoverflow.com/questions/25193488/how-to-turn-off-info-logging-in-pyspark
>>> >> >
>>> >> > I have only one log4j.properties; it is in /opt/spark-1.1.0/conf.
>>> >> >
>>> >> > Just before I launch IPython Notebook with a pyspark profile, I add
>>> >> > the dir and the properties file directly to the CLASSPATH and
>>> >> > SPARK_CLASSPATH env vars (as you can also see from the png).
>>> >> >
>>> >> > I still haven't been able to make any change which disables this
>>> >> > infernal debug output.
>>> >> >
>>> >> > Any ideas (WAGs, solutions, commiseration) would be greatly
>>> >> > appreciated.
>>> >> >
>>> >> > ---
>>> >> >
>>> >> > My log4j.properties:
>>> >> >
>>> >> > log4j.rootCategory=INFO, console
>>> >> > log4j.appender.console=org.apache.log4j.ConsoleAppender
>>> >> > log4j.appender.console.layout=org.apache.log4j.PatternLayout
>>> >> > log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>>> >>
>>> >> You should change log4j.rootCategory to WARN, console
>>> >>
>>> >> > # Change this to set Spark log level
>>> >> > log4j.logger.org.apache.spark=INFO
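>>> >>
>>> >> Change this one to WARN as well. A child logger's level overrides the
>>> >> root's, so if it stays at INFO, Spark's own INFO messages keep coming
>>> >> through even with rootCategory at WARN:
>>> >>
>>> >> log4j.logger.org.apache.spark=WARN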
>>> >> >
>>> >> > # Silence akka remoting
>>> >> > log4j.logger.Remoting=WARN
>>> >> >
>>> >> > # Ignore messages below warning level from Jetty, because it's a bit
>>> >> > verbose
>>> >> > log4j.logger.org.eclipse.jetty=WARN
>>> >> >
>>> >> >


-- 


“Science is the great antidote to the poison of enthusiasm and
superstition.”  -- Adam Smith
