Now, for simplicity, I'm testing with wordcount.py from the provided
examples, using Spark 1.6.0.
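
For reference, I'm submitting it roughly like this (input path trimmed):

  spark-submit --master yarn examples/src/main/python/wordcount.py <input>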

The first error I get is:

16/01/08 19:14:46 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl
library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1864)
        at [....]
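
I'm guessing that one just means the native LZO library isn't on
java.library.path and may be harmless; if so, something like this would
presumably point Spark at it (path illustrative):

  spark-submit --driver-library-path /path/to/hadoop/lib/native ...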

A bit lower down, I see this error:

16/01/08 19:14:48 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0
(TID 0, mundonovo-priv): org.apache.spark.SparkException:
Error from python worker:
  python: module pyspark.daemon not found
PYTHONPATH was:

/scratch5/hadoop/yarn/local/usercache/awp066/filecache/22/spark-assembly-1.6.0-hadoop2.4.0.jar:/home/jpr123/hg.pacific/python-common:/home/jpr123/python-libs:/home/jpr123/lib/python2.7/site-packages:/home/zsb739/local/lib/python2.7/site-packages:/home/jpr123/mobile-cdn-analysis:/home/awp066/lib/python2.7/site-packages:/scratch4/hadoop/yarn/local/usercache/awp066/appcache/application_1450370639491_0136/container_1450370639491_0136_01_000002/pyspark.zip:/scratch4/hadoop/yarn/local/usercache/awp066/appcache/application_1450370639491_0136/container_1450370639491_0136_01_000002/py4j-0.9-src.zip
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at [....]

And then there are a few more similar "pyspark.daemon not found" errors...
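
Could the executors be picking up a different Python than I expect? I
haven't pinned the interpreter anywhere; from the docs I gather something
like this is the usual way to do it (untested on my end, path illustrative):

  export PYSPARK_PYTHON=/usr/bin/python2.7

or, per application:

  spark-submit --conf spark.executorEnv.PYSPARK_PYTHON=/usr/bin/python2.7 ...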

Andrew



On Fri, Jan 8, 2016 at 2:31 PM, Bryan Cutler <cutl...@gmail.com> wrote:

> Hi Andrew,
>
> I know that older versions of Spark could not run PySpark on YARN in
> cluster mode, and I'm not sure whether that is fixed in 1.6.0. Can you try
> setting the --deploy-mode option to "client" when calling spark-submit?
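>
> Something along these lines (script name illustrative):
>
>   spark-submit --master yarn --deploy-mode client your_script.py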
>
> Bryan
>
> On Thu, Jan 7, 2016 at 2:39 PM, weineran <
> andrewweiner2...@u.northwestern.edu> wrote:
>
>> Hello,
>>
>> When I try to submit a Python job using spark-submit (with --master yarn
>> --deploy-mode cluster), I get the following error:
>>
>> Traceback (most recent call last):
>>   File "loss_rate_by_probe.py", line 15, in ?
>>     from pyspark import SparkContext
>>   File "/scratch5/hadoop/yarn/local/usercache/<username>/filecache/18/spark-assembly-1.3.1-hadoop2.4.0.jar/pyspark/__init__.py", line 41, in ?
>>   File "/scratch5/hadoop/yarn/local/usercache/<username>/filecache/18/spark-assembly-1.3.1-hadoop2.4.0.jar/pyspark/context.py", line 219
>>     with SparkContext._lock:
>>                     ^
>> SyntaxError: invalid syntax
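>>
>> For completeness, the full command is along these lines (arguments and
>> paths trimmed):
>>
>>   spark-submit --master yarn --deploy-mode cluster loss_rate_by_probe.py ...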
>>
>> This is very similar to this post from 2014:
>> http://apache-spark-user-list.1001560.n3.nabble.com/SparkContext-lock-Error-td18233.html
>> but unlike that person, I am using Python 2.7.8.
>>
>> Here is what I'm using:
>> Spark 1.3.1
>> Hadoop 2.4.0.2.1.5.0-695
>> Python 2.7.8
>>
>> Another clue:  I also installed Spark 1.6.0 and tried to submit the same
>> job.  I got a similar error:
>>
>> Traceback (most recent call last):
>>   File "loss_rate_by_probe.py", line 15, in ?
>>     from pyspark import SparkContext
>>   File "/scratch5/hadoop/yarn/local/usercache/<username>/appcache/application_1450370639491_0119/container_1450370639491_0119_01_000001/pyspark.zip/pyspark/__init__.py", line 61
>>     indent = ' ' * (min(len(m) for m in indents) if indents else 0)
>>                                                   ^
>> SyntaxError: invalid syntax
>>
>> Any thoughts?
>>
>> Andrew
>>
>
