[ https://issues.apache.org/jira/browse/SPARK-39609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562682#comment-17562682 ]
Yikun Jiang commented on SPARK-39609:
-------------------------------------

[https://github.com/cloudpipe/cloudpickle/pull/461]

cloudpickle doesn't support pypy3.8 yet.

> PySpark need to support pypy3.8 to avoid "No module named '_pickle"
> -------------------------------------------------------------------
>
>                 Key: SPARK-39609
>                 URL: https://issues.apache.org/jira/browse/SPARK-39609
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.4.0
>            Reporter: Yikun Jiang
>            Priority: Major
>
> {code:java}
> Starting test(pypy3): pyspark.sql.tests.test_arrow (temp output: /tmp/pypy3__pyspark.sql.tests.test_arrow__jx96qdzs.log)
> Traceback (most recent call last):
>   File "/usr/lib/pypy3.8/runpy.py", line 188, in _run_module_as_main
>     mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
>   File "/usr/lib/pypy3.8/runpy.py", line 111, in _get_module_details
>     __import__(pkg_name)
>   File "/__w/spark/spark/python/pyspark/__init__.py", line 59, in <module>
>     from pyspark.rdd import RDD, RDDBarrier
>   File "/__w/spark/spark/python/pyspark/rdd.py", line 54, in <module>
>     from pyspark.java_gateway import local_connect_and_auth
>   File "/__w/spark/spark/python/pyspark/java_gateway.py", line 32, in <module>
>     from pyspark.serializers import read_int, write_with_length, UTF8Deserializer
>   File "/__w/spark/spark/python/pyspark/serializers.py", line 68, in <module>
>     from pyspark import cloudpickle
>   File "/__w/spark/spark/python/pyspark/cloudpickle/__init__.py", line 4, in <module>
>     from pyspark.cloudpickle.cloudpickle import * # noqa
>   File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 57, in <module>
>     from .compat import pickle
>   File "/__w/spark/spark/python/pyspark/cloudpickle/compat.py", line 13, in <module>
>     from _pickle import Pickler # noqa: F401
> ModuleNotFoundError: No module named '_pickle'
> Had test failures in pyspark.sql.tests.test_arrow with pypy3; see logs.
> {code}
> Building the latest Dockerfile upgrades pypy3 to 3.8 (it was 3.7), but it
> seems cloudpickle has a bug.
> This may be related:
> https://github.com/cloudpipe/cloudpickle/commit/8bbea3e140767f51dd935a3c8f21c9a8e8702b7c,
> but I tried applying it and the test still failed. This needs a deeper look;
> if you know the reason, please let me know.
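
For anyone reproducing this outside the Spark test run: the traceback ends at an unconditional `from _pickle import Pickler`, and `_pickle` is a private C accelerator module that the pypy3.8 interpreter in the log does not provide. Below is a minimal sketch of a guarded import, assuming the public `pickle.Pickler` is an acceptable substitute when `_pickle` is missing. It is only an illustration of the failure mode, not necessarily what the cloudpickle PR or commit linked above does, and the helper name `dumps_with` is hypothetical.

```python
import io
import pickle

try:
    # CPython exposes the C-accelerated Pickler in the private _pickle module.
    from _pickle import Pickler
except ImportError:
    # Assumption: on interpreters without _pickle (such as the pypy3.8 in
    # the traceback), the public pickle module still provides a Pickler.
    from pickle import Pickler


def dumps_with(pickler_cls, obj):
    """Hypothetical helper: serialize obj with the given Pickler class."""
    buf = io.BytesIO()
    pickler_cls(buf).dump(obj)
    return buf.getvalue()


# Round-trip a small object to confirm the selected Pickler works.
payload = dumps_with(Pickler, {"a": 1})
assert pickle.loads(payload) == {"a": 1}
```

Running this under both CPython and PyPy shows whether the missing module alone explains the failure, or whether cloudpickle needs further changes for pypy3.8.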