This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new b93fd9c074bd [SPARK-47116][INFRA][R] Install proper Python version in SparkR Windows build to avoid warnings
b93fd9c074bd is described below

commit b93fd9c074bd9d0dd0d7cc06b1cccf0d8525c9ec
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Wed Feb 21 00:42:21 2024 -0800

    [SPARK-47116][INFRA][R] Install proper Python version in SparkR Windows build to avoid warnings
    
    ### What changes were proposed in this pull request?
    
    This PR installs Python 3.11 in SparkR build on Windows.
    
    ### Why are the changes needed?
    
    To remove unrelated warnings: https://github.com/HyukjinKwon/spark/actions/runs/7985005685/job/21802732830
    
    ```
    Traceback (most recent call last):
      File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\runpy.py", line 183, in _run_module_as_main
        mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
      File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\runpy.py", line 109, in _get_module_details
        __import__(pkg_name)
      File "D:\a\spark\spark\python\lib\pyspark.zip\pyspark\__init__.py", line 53, in <module>
      File "D:\a\spark\spark\python\lib\pyspark.zip\pyspark\rdd.py", line 54, in <module>
      File "D:\a\spark\spark\python\lib\pyspark.zip\pyspark\java_gateway.py", line 33, in <module>
      File "D:\a\spark\spark\python\lib\pyspark.zip\pyspark\serializers.py", line 69, in <module>
      File "D:\a\spark\spark\python\lib\pyspark.zip\pyspark\cloudpickle\__init__.py", line 1, in <module>
      File "D:\a\spark\spark\python\lib\pyspark.zip\pyspark\cloudpickle\cloudpickle.py", line 80, in <module>
    ImportError: cannot import name 'CellType' from 'types' (C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\types.py)
    ```
    
    The SparkR build does not need Python. However, it emits warnings when the installed Python version is too old, because session initialization attempts to look up Python Data Sources. The Windows 2019 runner ships Python 3.7, which Spark does not support, so we install a supported Python version instead.
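
    For context, `types.CellType` exists only on Python 3.8 and newer, which is why the cloudpickle bundled in PySpark fails to import on the runner's Python 3.7. A minimal sketch that reproduces the failure above (illustrative only, not part of this patch):

    ```
    # types.CellType was added to the stdlib in Python 3.8, so on the
    # runner's Python 3.7 this import raises the ImportError shown above.
    import sys

    print(sys.version)          # 3.7.9 on the windows-2019 runner
    from types import CellType  # fails on Python <= 3.7
    ```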
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, dev-only.
    
    ### How was this patch tested?
    
    Manually
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #45196 from HyukjinKwon/python-errors.
    
    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .github/workflows/build_sparkr_window.yml | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/.github/workflows/build_sparkr_window.yml b/.github/workflows/build_sparkr_window.yml
index fbaca36f9f87..4375b21be55e 100644
--- a/.github/workflows/build_sparkr_window.yml
+++ b/.github/workflows/build_sparkr_window.yml
@@ -58,6 +58,15 @@ jobs:
         Rscript -e "install.packages(c('knitr', 'rmarkdown', 'testthat', 
'e1071', 'survival', 'arrow', 'xml2'), repos='https://cloud.r-project.org/')"
         Rscript -e "pkg_list <- as.data.frame(installed.packages()[,c(1, 
3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]"
       shell: cmd
+    # SparkR build does not need Python. However, it emits warnings when the installed Python version is
+    # too old, because session initialization attempts to look up Python Data Sources. The Windows 2019
+    # runner ships Python 3.7, which Spark does not support, so we install a supported Python version
+    # instead, see SPARK-47116.
+    - name: Install Python 3.11
+      uses: actions/setup-python@v5
+      with:
+        python-version: '3.11'
+        architecture: x64
     - name: Build Spark
       run: |
        rem 1. '-Djna.nosys=true' is required to avoid kernel32.dll load failure.


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
