[ https://issues.apache.org/jira/browse/SPARK-54745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-54745:
---------------------------------
Priority: Blocker (was: Major)
> import pyspark fails on Windows since PySpark 4.1.0
> ---------------------------------------------------
>
> Key: SPARK-54745
> URL: https://issues.apache.org/jira/browse/SPARK-54745
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 4.1.0
> Environment: GitHub Runner with Windows and Python 3.11 installed.
> Reporter: Kristin Cowalcijk
> Assignee: Kristin Cowalcijk
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 4.2.0, 4.1.1
>
>
> We have observed GitHub Actions failures in the apache/sedona project since
> the release of PySpark 4.1.0. All failures happen when running Python tests
> on Windows. Here is an example run:
> https://github.com/apache/sedona/actions/runs/20313206935/job/58349803118
> The test failed with the following error message:
> {code}
> ============================= test session starts =============================
> platform win32 -- Python 3.11.9, pytest-9.0.2, pluggy-1.6.0
> rootdir: D:\a\sedona\sedona\python
> configfile: pyproject.toml
> plugins: anyio-4.12.0, cov-7.0.0
> collected 0 items / 1 error
> =================================== ERRORS ====================================
> ___________ ERROR collecting tests/utils/test_geomserde_speedup.py ____________
> tests\utils\test_geomserde_speedup.py:32: in <module>
>     from sedona.spark.utils import geometry_serde
> sedona\spark\__init__.py:19: in <module>
>     import pyspark
> .venv\Lib\site-packages\pyspark\__init__.py:71: in <module>
>     from pyspark.accumulators import Accumulator, AccumulatorParam
> .venv\Lib\site-packages\pyspark\accumulators.py:324: in <module>
>     class AccumulatorUnixServer(socketserver.UnixStreamServer):
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> E   AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
> =========================== short test summary info ===========================
> ERROR tests/utils/test_geomserde_speedup.py - AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'
> !!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
> ============================== 1 error in 0.80s ===============================
> {code}
> The exception is triggered simply by importing pyspark. It appears to be
> related to the following PR, which introduced Unix domain sockets (UDS) to
> PySpark: https://github.com/apache/spark/pull/50466. It seems that although
> Windows already supports {{AF_UNIX}}, the Python standard library has not
> yet added support for it on that platform, so
> {{socketserver.UnixStreamServer}} is not defined there.
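The traceback shows the class statement failing at module import time, because the base class is looked up when the module is loaded. One way this kind of failure can be avoided is to define the Unix-socket server only when the standard library provides the base class. The following is a minimal sketch under that assumption, not the fix applied in Spark: the names UnixAccumulatorServerSketch and pick_server_class are hypothetical and do not come from PySpark or the linked PR.

```python
import socketserver

# True on platforms where the standard library exposes
# socketserver.UnixStreamServer; per the traceback above it is
# missing on Windows.
HAS_UNIX_STREAM_SERVER = hasattr(socketserver, "UnixStreamServer")

if HAS_UNIX_STREAM_SERVER:

    class UnixAccumulatorServerSketch(socketserver.UnixStreamServer):
        """Illustrative stand-in; hypothetical, not PySpark's AccumulatorUnixServer."""

else:
    # Keep the name importable so that loading the module never fails.
    UnixAccumulatorServerSketch = None


def pick_server_class(prefer_unix: bool):
    """Return the Unix-socket server class if requested and available,
    otherwise fall back to the TCP server class (hypothetical helper)."""
    if prefer_unix and UnixAccumulatorServerSketch is not None:
        return UnixAccumulatorServerSketch
    return socketserver.TCPServer
```

With a guard of this shape the module imports on every platform, and the platform check is deferred to the point where a server class is actually chosen.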