[ 
https://issues.apache.org/jira/browse/SPARK-54745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-54745:
---------------------------------
    Priority: Blocker  (was: Major)

> import pyspark fails on Windows since PySpark 4.1.0
> ---------------------------------------------------
>
>                 Key: SPARK-54745
>                 URL: https://issues.apache.org/jira/browse/SPARK-54745
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 4.1.0
>         Environment: GitHub Runner with Windows and Python 3.11 installed.
>            Reporter: Kristin Cowalcijk
>            Assignee: Kristin Cowalcijk
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 4.2.0, 4.1.1
>
>
> We have observed GitHub Actions failures in the apache/sedona project since the 
> release of PySpark 4.1.0. All failures occur when running the Python tests on 
> Windows. Here is an example run: 
> https://github.com/apache/sedona/actions/runs/20313206935/job/58349803118
> The test run failed with the following error message:
> {code}
> ============================= test session starts =============================
> platform win32 -- Python 3.11.9, pytest-9.0.2, pluggy-1.6.0
> rootdir: D:\a\sedona\sedona\python
> configfile: pyproject.toml
> plugins: anyio-4.12.0, cov-7.0.0
> collected 0 items / 1 error
> =================================== ERRORS ====================================
> ___________ ERROR collecting tests/utils/test_geomserde_speedup.py ____________
> tests\utils\test_geomserde_speedup.py:32: in <module>
>     from sedona.spark.utils import geometry_serde
> sedona\spark\__init__.py:19: in <module>
>     import pyspark
> .venv\Lib\site-packages\pyspark\__init__.py:71: in <module>
>     from pyspark.accumulators import Accumulator, AccumulatorParam
> .venv\Lib\site-packages\pyspark\accumulators.py:324: in <module>
>     class AccumulatorUnixServer(socketserver.UnixStreamServer):
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> E   AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
> =========================== short test summary info ===========================
> ERROR tests/utils/test_geomserde_speedup.py - AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
> !!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
> ============================== 1 error in 0.80s ===============================
> {code}
> The exception is triggered simply by importing pyspark. I found that it is 
> related to the following PR, which introduced UDS (Unix domain socket) 
> support to PySpark: https://github.com/apache/spark/pull/50466. It seems that 
> although Windows already supports {{AF_UNIX}}, the Python standard library 
> has not yet added support for it on Windows, so 
> {{socketserver.UnixStreamServer}} is not defined there.
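
One way to make a module importable on platforms without Unix domain socket support is to check for the class before subclassing it. The following is only a minimal sketch of that pattern, not the change made for this issue; the names HAS_UNIX_STREAM_SERVER, pick_server_base and ExampleServer are hypothetical and do not come from PySpark.

```python
import socketserver

# True only where the standard library defines the Unix-socket server
# classes (they are not defined on Windows, as the traceback shows).
HAS_UNIX_STREAM_SERVER = hasattr(socketserver, "UnixStreamServer")


def pick_server_base():
    """Return a socketserver base class that exists on this platform.

    Hypothetical helper: falls back to TCPServer when
    UnixStreamServer is not available.
    """
    if HAS_UNIX_STREAM_SERVER:
        return socketserver.UnixStreamServer
    return socketserver.TCPServer


class ExampleServer(pick_server_base()):
    """Illustrative server class; defining it never raises AttributeError."""


if __name__ == "__main__":
    # Print which base class was selected on the current platform.
    print(ExampleServer.__mro__[1].__name__)
```

With a guard like this, the class body is evaluated without touching an attribute that may be missing, so the import succeeds on Windows and the Unix-socket variant is still used where it exists.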



