Kristin Cowalcijk created SPARK-54745:
-----------------------------------------

             Summary: import pyspark fails on Windows since PySpark 4.1.0
                 Key: SPARK-54745
                 URL: https://issues.apache.org/jira/browse/SPARK-54745
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 4.1.0
         Environment: GitHub Runner with Windows and Python 3.11 installed.
            Reporter: Kristin Cowalcijk


We have observed GitHub Actions failures in the apache/sedona project since the 
release of PySpark 4.1.0. All failures occur when running Python tests on 
Windows. Here is an example run: 
https://github.com/apache/sedona/actions/runs/20313206935/job/58349803118

The test failed with the following error message:

{code}
============================= test session starts =============================
platform win32 -- Python 3.11.9, pytest-9.0.2, pluggy-1.6.0
rootdir: D:\a\sedona\sedona\python
configfile: pyproject.toml
plugins: anyio-4.12.0, cov-7.0.0
collected 0 items / 1 error

=================================== ERRORS ====================================
___________ ERROR collecting tests/utils/test_geomserde_speedup.py ____________
tests\utils\test_geomserde_speedup.py:32: in <module>
    from sedona.spark.utils import geometry_serde
sedona\spark\__init__.py:19: in <module>
    import pyspark
.venv\Lib\site-packages\pyspark\__init__.py:71: in <module>
    from pyspark.accumulators import Accumulator, AccumulatorParam
.venv\Lib\site-packages\pyspark\accumulators.py:324: in <module>
    class AccumulatorUnixServer(socketserver.UnixStreamServer):
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E   AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
=========================== short test summary info ===========================
ERROR tests/utils/test_geomserde_speedup.py - AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
============================== 1 error in 0.80s ===============================
{code}

The exception is triggered simply by importing pyspark. It appears to be 
related to the following PR, which introduced Unix domain socket (UDS) support 
to PySpark: https://github.com/apache/spark/pull/50466. Although Windows itself 
already supports {{AF_UNIX}}, the Python standard library on Windows does not 
seem to expose it, so {{socketserver.UnixStreamServer}} is not defined there.
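
For illustration only, here is a minimal sketch of how a module-level class 
definition that depends on {{socketserver.UnixStreamServer}} could be guarded so 
that the import itself does not fail. It is not the code in PySpark and not 
necessarily the fix Spark should adopt; the names {{HAS_UNIX_STREAM_SERVER}}, 
{{HAS_AF_UNIX}} and {{GuardedUnixServer}} are hypothetical. It relies on the 
fact that the standard library only defines {{UnixStreamServer}} when 
{{socket.AF_UNIX}} is available.

```python
import socket
import socketserver

# socketserver defines UnixStreamServer only when socket.AF_UNIX exists,
# so both checks are expected to agree on any given platform.
HAS_AF_UNIX = hasattr(socket, "AF_UNIX")
HAS_UNIX_STREAM_SERVER = hasattr(socketserver, "UnixStreamServer")

if HAS_UNIX_STREAM_SERVER:
    # Hypothetical class name, used for illustration only.
    class GuardedUnixServer(socketserver.UnixStreamServer):
        """Defined only where the standard library provides the base class."""
else:
    # Callers would have to fall back to a TCP-based server in this case.
    GuardedUnixServer = None
```

With a guard like this, importing the module succeeds on every platform, and 
the lack of UDS support only matters at the point where the server is used.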



