Kristin Cowalcijk created SPARK-54745:
-----------------------------------------
Summary: import pyspark fails on Windows since PySpark 4.1.0
Key: SPARK-54745
URL: https://issues.apache.org/jira/browse/SPARK-54745
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 4.1.0
Environment: GitHub Runner with Windows and Python 3.11 installed.
Reporter: Kristin Cowalcijk
We have observed GitHub Actions failures in the apache/sedona project since the
release of PySpark 4.1.0. All failures happen when running Python tests on
Windows. Here is an example run:
https://github.com/apache/sedona/actions/runs/20313206935/job/58349803118
The test failed with the following error message:
{code}
============================= test session starts =============================
platform win32 -- Python 3.11.9, pytest-9.0.2, pluggy-1.6.0
rootdir: D:\a\sedona\sedona\python
configfile: pyproject.toml
plugins: anyio-4.12.0, cov-7.0.0
collected 0 items / 1 error
=================================== ERRORS ====================================
___________ ERROR collecting tests/utils/test_geomserde_speedup.py ____________
tests\utils\test_geomserde_speedup.py:32: in <module>
from sedona.spark.utils import geometry_serde
sedona\spark\__init__.py:19: in <module>
import pyspark
.venv\Lib\site-packages\pyspark\__init__.py:71: in <module>
from pyspark.accumulators import Accumulator, AccumulatorParam
.venv\Lib\site-packages\pyspark\accumulators.py:324: in <module>
class AccumulatorUnixServer(socketserver.UnixStreamServer):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
=========================== short test summary info ===========================
ERROR tests/utils/test_geomserde_speedup.py - AttributeError: module
'socketserver' has no attribute 'UnixStreamServer'
!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
============================== 1 error in 0.80s ===============================
{code}
The exception is triggered simply by importing pyspark. It appears to be
related to the following PR, which introduced UDS (Unix domain socket) support
to PySpark: https://github.com/apache/spark/pull/50466. It seems that although
Windows itself already supports {{AF_UNIX}}, the Python standard library does
not yet expose it on Windows, so {{socketserver.UnixStreamServer}} is not
defined there.
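To illustrate the failure mode, here is a minimal sketch of how a module could define a Unix domain socket server class only where the standard library provides the base class. This is not the actual PySpark code or a proposed patch. The names {{GuardedUnixServer}} and {{pick_server_class}} are hypothetical, and the TCP fallback is an assumption about what a fix might do.

```python
import socketserver

# socketserver only defines UnixStreamServer when the socket module exposes
# AF_UNIX, so this is False on the Windows runner from the traceback above.
HAS_UNIX_STREAM_SERVER = hasattr(socketserver, "UnixStreamServer")

if HAS_UNIX_STREAM_SERVER:
    # Hypothetical stand-in for a UDS-based server class.
    class GuardedUnixServer(socketserver.UnixStreamServer):
        """Defined only on platforms where the base class exists."""
else:
    # Leave a placeholder so that importing this module never raises.
    GuardedUnixServer = None


def pick_server_class():
    """Return the UDS server class if available, otherwise a TCP server (assumed fallback)."""
    return GuardedUnixServer or socketserver.TCPServer
```

With a guard like this, the class statement is never evaluated on platforms that lack {{UnixStreamServer}}, so the import succeeds and the decision to use UDS is deferred to runtime.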