I noticed that you're using PyPy 2.2.1, but it looks like Spark 1.5.1's
docs say that we only support PyPy 2.3+. Could you try using a newer PyPy
version to see if that works?
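
For example, assuming a newer build were installed at /opt/pypy-2.5.1 (that path is hypothetical), you could point PySpark at it the same way you already do:

    $ PYSPARK_PYTHON=/opt/pypy-2.5.1/bin/pypy ~/Tool/spark-1.5.1-bin-hadoop2.6/bin/pyspark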

I just checked and it looks like our Jenkins tests are running against PyPy
2.5.1, so that version is known to work. I'm not sure what the actual
minimum supported PyPy version is. Would you be interested in helping to
investigate so that we can update the documentation or produce a fix to
restore compatibility with earlier PyPy builds?
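
For reference, the traceback below fails inside _copy_func in pyspark/serializers.py, which copies collections.namedtuple by rebuilding it with types.FunctionType, and in doing so reads f.__closure__. PyPy 2.2.1 apparently only exposes the old Python 2 spelling func_closure, not the __closure__ alias. If that turns out to be the only incompatibility, a fix might be as small as falling back to the old attribute names. Here's a rough sketch (untested on PyPy 2.2.1; _first_attr is just an illustrative helper, not existing Spark code):

    import types

    def _first_attr(f, *names):
        # Return the first attribute spelling that exists on f.
        for name in names:
            try:
                return getattr(f, name)
            except AttributeError:
                continue
        return None

    def _copy_func(f):
        # Copy a function object, tolerating interpreters that lack the
        # new-style __closure__ alias and only expose func_closure.
        return types.FunctionType(
            _first_attr(f, '__code__', 'func_code'),
            _first_attr(f, '__globals__', 'func_globals'),
            f.__name__,
            _first_attr(f, '__defaults__', 'func_defaults'),
            _first_attr(f, '__closure__', 'func_closure'))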

On Wed, Nov 4, 2015 at 11:56 PM, Chang Ya-Hsuan <sumti...@gmail.com> wrote:

> Hi all,
>
> I am trying to run PySpark with PyPy. It works with spark-1.3.1 but
> fails with spark-1.4.1 and spark-1.5.1.
>
> my pypy version:
>
> $ /usr/bin/pypy --version
> Python 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015, 15:18:40)
> [PyPy 2.2.1 with GCC 4.8.4]
>
> Works with spark-1.3.1:
>
> $ PYSPARK_PYTHON=/usr/bin/pypy ~/Tool/spark-1.3.1-bin-hadoop2.6/bin/pyspark
> Python 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015, 15:18:40)
> [PyPy 2.2.1 with GCC 4.8.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> 15/11/05 15:50:30 WARN Utils: Your hostname, xxxxxx resolves to a loopback
> address: 127.0.1.1; using xxx.xxx.xxx.xxx instead (on interface eth0)
> 15/11/05 15:50:30 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
> another address
> 15/11/05 15:50:31 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 1.3.1
>       /_/
>
> Using Python version 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015)
> SparkContext available as sc, HiveContext available as sqlContext.
> And now for something completely different: ``Armin: "Prolog is a mess.",
> CF:
> "No, it's very cool!", Armin: "Isn't this what I said?"''
> >>>
>
> Error message for spark-1.5.1:
>
> $ PYSPARK_PYTHON=/usr/bin/pypy ~/Tool/spark-1.5.1-bin-hadoop2.6/bin/pyspark
> Python 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015, 15:18:40)
> [PyPy 2.2.1 with GCC 4.8.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> Traceback (most recent call last):
>   File "app_main.py", line 72, in run_toplevel
>   File "app_main.py", line 614, in run_it
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/shell.py", line 30, in <module>
>     import pyspark
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/__init__.py", line 41, in <module>
>     from pyspark.context import SparkContext
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/context.py", line 26, in <module>
>     from pyspark import accumulators
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/accumulators.py", line 98, in <module>
>     from pyspark.serializers import read_int, PickleSerializer
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/serializers.py", line 400, in <module>
>     _hijack_namedtuple()
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/serializers.py", line 378, in _hijack_namedtuple
>     _old_namedtuple = _copy_func(collections.namedtuple)
>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/serializers.py", line 376, in _copy_func
>     f.__defaults__, f.__closure__)
> AttributeError: 'function' object has no attribute '__closure__'
> And now for something completely different: ``the traces don't lie''
>
> Is this a known issue? Any suggestions for resolving it? Or how can I
> help fix this problem?
>
> Thanks.
>
