Thanks for your quick reply. I will test several PyPy versions and report the results later.
On Thu, Nov 5, 2015 at 4:06 PM, Josh Rosen <rosenvi...@gmail.com> wrote:

> I noticed that you're using PyPy 2.2.1, but it looks like Spark 1.5.1's
> docs say that we only support PyPy 2.3+. Could you try using a newer PyPy
> version to see if that works?
>
> I just checked and it looks like our Jenkins tests are running against
> PyPy 2.5.1, so that version is known to work. I'm not sure what the actual
> minimum supported PyPy version is. Would you be interested in helping to
> investigate so that we can update the documentation or produce a fix to
> restore compatibility with earlier PyPy builds?
>
> On Wed, Nov 4, 2015 at 11:56 PM, Chang Ya-Hsuan <sumti...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am trying to run pyspark with pypy. It works with spark-1.3.1 but
>> fails with spark-1.4.1 and spark-1.5.1.
>>
>> my pypy version:
>>
>> $ /usr/bin/pypy --version
>> Python 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015, 15:18:40)
>> [PyPy 2.2.1 with GCC 4.8.4]
>>
>> works with spark-1.3.1:
>>
>> $ PYSPARK_PYTHON=/usr/bin/pypy ~/Tool/spark-1.3.1-bin-hadoop2.6/bin/pyspark
>> Python 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015, 15:18:40)
>> [PyPy 2.2.1 with GCC 4.8.4] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> 15/11/05 15:50:30 WARN Utils: Your hostname, xxxxxx resolves to a
>> loopback address: 127.0.1.1; using xxx.xxx.xxx.xxx instead (on interface eth0)
>> 15/11/05 15:50:30 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
>> another address
>> 15/11/05 15:50:31 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/  '_/
>>    /__ / .__/\_,_/_/ /_/\_\   version 1.3.1
>>       /_/
>>
>> Using Python version 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015)
>> SparkContext available as sc, HiveContext available as sqlContext.
>> And now for something completely different: ``Armin: "Prolog is a mess.",
>> CF: "No, it's very cool!", Armin: "Isn't this what I said?"''
>> >>>
>>
>> error message for 1.5.1:
>>
>> $ PYSPARK_PYTHON=/usr/bin/pypy ~/Tool/spark-1.5.1-bin-hadoop2.6/bin/pyspark
>> Python 2.7.3 (2.2.1+dfsg-1ubuntu0.3, Sep 30 2015, 15:18:40)
>> [PyPy 2.2.1 with GCC 4.8.4] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> Traceback (most recent call last):
>>   File "app_main.py", line 72, in run_toplevel
>>   File "app_main.py", line 614, in run_it
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/shell.py", line 30, in <module>
>>     import pyspark
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/__init__.py", line 41, in <module>
>>     from pyspark.context import SparkContext
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/context.py", line 26, in <module>
>>     from pyspark import accumulators
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/accumulators.py", line 98, in <module>
>>     from pyspark.serializers import read_int, PickleSerializer
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/serializers.py", line 400, in <module>
>>     _hijack_namedtuple()
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/serializers.py", line 378, in _hijack_namedtuple
>>     _old_namedtuple = _copy_func(collections.namedtuple)
>>   File "/home/yahsuan/Tool/spark-1.5.1-bin-hadoop2.6/python/pyspark/serializers.py", line 376, in _copy_func
>>     f.__defaults__, f.__closure__)
>> AttributeError: 'function' object has no attribute '__closure__'
>>
>> And now for something completely different: ``the traces don't lie''
>>
>> Is this a known issue? Any suggestion to resolve it? Or how can I help
>> to fix this problem?
>>
>> Thanks.

-- 
張雅軒
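For anyone hitting the same traceback: the failure is in `_copy_func` in pyspark/serializers.py, which reads `f.__closure__` directly, and the PyPy 2.2.1 function object apparently only exposes the legacy Python 2 names (`func_closure`, `func_defaults`, ...). Below is a minimal sketch of a possible compatibility shim — an assumption on my part, not the actual Spark patch — that falls back to the old attribute names when the dunder aliases are missing:

```python
import types


def _copy_func(f):
    """Copy a function object without assuming new-style attribute names.

    Hypothetical workaround sketch: prefer __code__/__globals__/
    __defaults__/__closure__, but fall back to the legacy func_* names
    that older interpreters (e.g. PyPy 2.2.1) expose.
    """
    code = getattr(f, '__code__', None) or getattr(f, 'func_code', None)
    globs = getattr(f, '__globals__', None) or getattr(f, 'func_globals', None)
    defaults = getattr(f, '__defaults__', None) or getattr(f, 'func_defaults', None)
    closure = getattr(f, '__closure__', None) or getattr(f, 'func_closure', None)
    return types.FunctionType(code, globs, f.__name__, defaults, closure)


# Quick sanity check: the copy must preserve defaults and closure cells.
def make_adder(n):
    def add(x, y=1):
        return x + y + n
    return add


copied = _copy_func(make_adder(10))
print(copied(5))  # 16: the default y=1 and the closed-over n=10 survive
```

The `getattr`-with-fallback pattern keeps a single code path working on both attribute naming schemes, which seems cheaper than branching on the interpreter version.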