Re: SparkContext SyntaxError: invalid syntax

2016-01-21 Thread Andrew Weiner
> > Do you still need help on the PR? > btw, does this apply to YARN client mode? > > ------ > From: andrewweiner2...@u.northwestern.edu > Date: Sun, 17 Jan 2016 17:00:39 -0600 > Subject: Re: SparkContext SyntaxError: invalid syntax > To: cutl.

Re: SparkContext SyntaxError: invalid syntax

2016-01-19 Thread Felix Cheung
> Date: Sun, 17 Jan 2016 17:00:39 -0600 > Subject: Re: SparkContext SyntaxError: invalid syntax > To: cutl...@gmail.com > CC: user@spark.apache.org > > > Yeah, I do think it would be worth explicitly stating this in the docs. I > was going to try to edit the docs mys

Re: SparkContext SyntaxError: invalid syntax

2016-01-18 Thread Andrew Weiner
> Date: Sun, 17 Jan 2016 17:00:39 -0600 > Subject: Re: SparkContext SyntaxError: invalid syntax > To: cutl...@gmail.com > CC: user@spark.apache.org > > > Yeah, I do think it would be worth explicitly stating this in the docs. I > was going to try to edit th

RE: SparkContext SyntaxError: invalid syntax

2016-01-17 Thread Felix Cheung
Do you still need help on the PR? btw, does this apply to YARN client mode? From: andrewweiner2...@u.northwestern.edu Date: Sun, 17 Jan 2016 17:00:39 -0600 Subject: Re: SparkContext SyntaxError: invalid syntax To: cutl...@gmail.com CC: user@spark.apache.org Yeah, I do think it would be worth

Re: SparkContext SyntaxError: invalid syntax

2016-01-17 Thread Andrew Weiner
Yeah, I do think it would be worth explicitly stating this in the docs. I was going to try to edit the docs myself and submit a pull request, but I'm having trouble building the docs from GitHub. If anyone else wants to do this, here is approximately what I would say: (To be added to

Re: SparkContext SyntaxError: invalid syntax

2016-01-15 Thread Andrew Weiner
Indeed! Here is the output when I run in cluster mode: Traceback (most recent call last): File "pi.py", line 22, in ? raise RuntimeError("\n"+str(sys.version_info) +"\n"+ RuntimeError: (2, 4, 3, 'final', 0) [('PYSPARK_GATEWAY_PORT', '48079'), ('PYTHONPATH',
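The traceback above comes from a deliberate early failure that prints the interpreter version and the process environment. A minimal sketch of that kind of check follows. It is not the actual pi.py from the thread; the function names `describe_runtime` and `fail_with_runtime_report` are made up for illustration. The idea is to call it before importing pyspark, so that an old interpreter reports its version instead of stopping on a SyntaxError.

```python
import os
import sys


def describe_runtime(environ=None):
    """Return a text report of the interpreter version and environment.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    """
    if environ is None:
        environ = os.environ
    lines = [
        "version_info: " + str(tuple(sys.version_info)),
        "executable: " + str(sys.executable),
    ]
    # Sort the variables so reports from different nodes are easy to compare.
    for key in sorted(environ.keys()):
        lines.append(key + "=" + str(environ[key]))
    return "\n".join(lines)


def fail_with_runtime_report():
    """Abort on purpose so the report appears in the job's traceback."""
    raise RuntimeError("\n" + describe_runtime())
```

Calling `fail_with_runtime_report()` as the first statement of the submitted script makes the report show up in the YARN application logs, which is where the (2, 4, 3, 'final', 0) above was found.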

Re: SparkContext SyntaxError: invalid syntax

2016-01-15 Thread Andrew Weiner
I tried playing around with my environment variables, and here is an update. When I run in cluster mode, my environment variables do not persist throughout the entire job. For example, I tried creating a local copy of HADOOP_CONF_DIR in /home//local/etc/hadoop/conf, and then, in spark-env.sh I

Re: SparkContext SyntaxError: invalid syntax

2016-01-15 Thread Andrew Weiner
Actually, I just found this [https://issues.apache.org/jira/browse/SPARK-1680], which, after a bit of googling and reading, leads me to believe that the preferred way to change the YARN environment is to edit the spark-defaults.conf file by adding this line: spark.yarn.appMasterEnv.PYSPARK_PYTHON

Re: SparkContext SyntaxError: invalid syntax

2016-01-15 Thread Andrew Weiner
I finally got the pi.py example to run in yarn cluster mode. This was the key insight: https://issues.apache.org/jira/browse/SPARK-9229 I had to set SPARK_YARN_USER_ENV in spark-env.sh: export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/home/aqualab/local/bin/python" This caused the PYSPARK_PYTHON
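One way to confirm that a setting like this took effect is to compare PYSPARK_PYTHON with the interpreter that is actually running the script. The sketch below is only an illustration: `check_pyspark_python` is a hypothetical helper, and it assumes the variable is visible in the process environment of the Python driver, which this thread suggests but which may differ between Spark versions and deploy modes.

```python
import os
import sys


def check_pyspark_python(environ, executable):
    """Compare PYSPARK_PYTHON with the interpreter actually running.

    Returns a (ok, message) tuple. `environ` is a mapping such as
    os.environ and `executable` is a path such as sys.executable.
    """
    wanted = environ.get("PYSPARK_PYTHON")
    if wanted is None:
        return False, "PYSPARK_PYTHON is not set in this process"
    # Resolve symlinks so /usr/bin/python and its target compare equal.
    if os.path.realpath(wanted) != os.path.realpath(executable):
        return False, "PYSPARK_PYTHON=%s but running %s" % (wanted, executable)
    return True, "running the interpreter named by PYSPARK_PYTHON"


if __name__ == "__main__":
    ok, message = check_pyspark_python(os.environ, sys.executable)
    print(message)
```

Submitting this file in cluster mode and reading the message in the application logs shows whether the container picked up the intended interpreter.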

Re: SparkContext SyntaxError: invalid syntax

2016-01-15 Thread Bryan Cutler
Glad you got it going! It wasn't very obvious what needed to be set; maybe it is worth explicitly stating this in the docs, since it seems to have come up a couple of times before too. Bryan On Fri, Jan 15, 2016 at 12:33 PM, Andrew Weiner <andrewweiner2...@u.northwestern.edu> wrote: > Actually,

Re: SparkContext SyntaxError: invalid syntax

2016-01-14 Thread Andrew Weiner
Hi Bryan, I ran "$> python --version" on every node on the cluster, and it is Python 2.7.8 for every single one. When I try to submit the Python example in client mode * ./bin/spark-submit --master yarn --deploy-mode client --driver-memory 4g --executor-memory 2g

Re: SparkContext SyntaxError: invalid syntax

2016-01-13 Thread Bryan Cutler
Hi Andrew, There are a couple of things to check. First, is Python 2.7 the default version on all nodes in the cluster or is it an alternate install? Meaning what is the output of this command "$> python --version" If it is an alternate install, you could set the environment variable

Re: SparkContext SyntaxError: invalid syntax

2016-01-13 Thread Andrew Weiner
Thanks for your continuing help. Here is some additional info. *OS/architecture* output of *cat /proc/version*: Linux version 2.6.18-400.1.1.el5 (mockbu...@x86-012.build.bos.redhat.com) output of *lsb_release -a*: LSB Version:

Re: SparkContext SyntaxError: invalid syntax

2016-01-08 Thread Bryan Cutler
Hi Andrew, I know that older versions of Spark could not run PySpark on YARN in cluster mode. I'm not sure if that is fixed in 1.6.0 though. Can you try setting the deploy-mode option to "client" when calling spark-submit? Bryan On Thu, Jan 7, 2016 at 2:39 PM, weineran <

Re: SparkContext SyntaxError: invalid syntax

2016-01-08 Thread Andrew Weiner
Now, for simplicity, I'm testing with wordcount.py from the provided examples, using Spark 1.6.0. The first error I get is: 16/01/08 19:14:46 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path at