This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-4.1
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.1 by this push:
new 296e6820eddc [SPARK-54153][PYTHON][TESTS][FOLLOWUP] Skip `test_perf_profiler_data_source` if `pyarrow` is absent
296e6820eddc is described below
commit 296e6820eddcf2adc42a3ca7aa8ebcf387260f08
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Fri Nov 21 14:55:16 2025 -0800
[SPARK-54153][PYTHON][TESTS][FOLLOWUP] Skip `test_perf_profiler_data_source` if `pyarrow` is absent
### What changes were proposed in this pull request?
This PR aims to skip `test_perf_profiler_data_source` if `pyarrow` is
absent.
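
As an illustration of the guard this PR adds, here is a minimal, self-contained sketch of the `unittest.skipIf` pattern. The names `have_pyarrow` and `pyarrow_requirement_message` are taken from the diff, but their definitions here are assumptions made for this example; the real ones in PySpark's test utilities may be computed differently. `ExampleTests` and `run_example` are hypothetical.

```python
import importlib.util
import unittest

# Assumption: stand-ins for the names used in the diff. The actual
# definitions in PySpark's testing utilities may differ.
have_pyarrow = importlib.util.find_spec("pyarrow") is not None
pyarrow_requirement_message = None if have_pyarrow else "No module named 'pyarrow'"


class ExampleTests(unittest.TestCase):
    # The test is reported as skipped, not errored, when pyarrow is missing.
    @unittest.skipIf(not have_pyarrow, pyarrow_requirement_message)
    def test_needs_pyarrow(self):
        # Import inside the test so that collecting this module never fails.
        import pyarrow as pa

        self.assertEqual(pa.array([1, 2, 3]).to_pylist(), [1, 2, 3])


def run_example():
    """Run the hypothetical test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

On an interpreter without `pyarrow` (such as the PyPy CI image in the failing run), `run_example()` should report one skipped test and no errors; with `pyarrow` installed, the test runs normally.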
### Why are the changes needed?
To recover the failed `PyPy` CIs.
- https://github.com/apache/spark/actions/workflows/build_python_pypy3.10.yml
- https://github.com/apache/spark/actions/runs/19574648782
- https://github.com/apache/spark/actions/runs/19574648782/job/56056836234
```
======================================================================
ERROR: test_perf_profiler_data_source
(pyspark.sql.tests.test_udf_profiler.UDFProfiler2Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/__w/spark/spark/python/pyspark/sql/tests/test_udf_profiler.py",
line 609, in test_perf_profiler_data_source
self.spark.read.format("TestDataSource").load().collect()
File "/__w/spark/spark/python/pyspark/sql/classic/dataframe.py", line
469, in collect
sock_info = self._jdf.collectToPython()
File
"/__w/spark/spark/python/lib/py4j-0.10.9.9-src.zip/py4j/java_gateway.py", line
1362, in __call__
return_value = get_return_value(
File "/__w/spark/spark/python/pyspark/errors/exceptions/captured.py",
line 263, in deco
return f(*a, **kw)
File
"/__w/spark/spark/python/lib/py4j-0.10.9.9-src.zip/py4j/protocol.py", line 327,
in get_return_value
raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling
o235.collectToPython.
: org.apache.spark.SparkException:
Error from python worker:
Traceback (most recent call last):
File "/usr/local/pypy/pypy3.10/lib/pypy3.10/runpy.py", line 199, in
_run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/pypy/pypy3.10/lib/pypy3.10/runpy.py", line 86, in
_run_code
exec(code, run_globals)
File "/__w/spark/spark/python/lib/pyspark.zip/pyspark/daemon.py", line
37, in <module>
File "/usr/local/pypy/pypy3.10/lib/pypy3.10/importlib/__init__.py",
line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in
_find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<builtin>/frozen importlib._bootstrap_external", line 897, in
exec_module
File "<frozen importlib._bootstrap>", line 241, in
_call_with_frames_removed
File
"/__w/spark/spark/python/lib/pyspark.zip/pyspark/sql/worker/plan_data_source_read.py",
line 21, in <module>
import pyarrow as pa
ModuleNotFoundError: No module named 'pyarrow'
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #53162 from dongjoon-hyun/SPARK-54153.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 9b0b1ce2d628f18c5dbe85c0de9884960d50f71b)
Signed-off-by: Dongjoon Hyun <[email protected]>
---
python/pyspark/sql/tests/test_udf_profiler.py | 1 +
1 file changed, 1 insertion(+)
diff --git a/python/pyspark/sql/tests/test_udf_profiler.py b/python/pyspark/sql/tests/test_udf_profiler.py
index 37f4a70fabd2..e6a7bf40b945 100644
--- a/python/pyspark/sql/tests/test_udf_profiler.py
+++ b/python/pyspark/sql/tests/test_udf_profiler.py
@@ -585,6 +585,7 @@ class UDFProfiler2TestsMixin:
         for id in self.profile_results:
             self.assert_udf_profile_present(udf_id=id, expected_line_count_prefix=2)

+    @unittest.skipIf(not have_pyarrow, pyarrow_requirement_message)
     def test_perf_profiler_data_source(self):
         class TestDataSourceReader(DataSourceReader):
             def __init__(self, schema):