[ https://issues.apache.org/jira/browse/SPARK-46181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46181.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 44088
[https://github.com/apache/spark/pull/44088]

> Split scheduled Python build
> ----------------------------
>
>                 Key: SPARK-46181
>                 URL: https://issues.apache.org/jira/browse/SPARK-46181
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Project Infra, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> The Python 3.12 build fails as shown below:
> {code}
> /__w/spark/spark/python/pyspark/pandas/utils.py:1015: 
> PandasAPIOnSparkAdviceWarning: `to_pandas` loads all data into the driver's 
> memory. It should only be used if the resulting pandas Series is expected to 
> be small.
>   warnings.warn(message, PandasAPIOnSparkAdviceWarning)
> /__w/spark/spark/python/pyspark/testing/pandasutils.py:401: FutureWarning: 
> `assertPandasOnSparkEqual` will be removed in Spark 4.0.0. Use 
> `ps.testing.assert_frame_equal`, `ps.testing.assert_series_equal` and 
> `ps.testing.assert_index_equal` instead.
>   warnings.warn(
> /__w/spark/spark/python/pyspark/pandas/utils.py:1015: 
> PandasAPIOnSparkAdviceWarning: `to_pandas` loads all data into the driver's 
> memory. It should only be used if the resulting pandas DataFrame is expected 
> to be small.
>   warnings.warn(message, PandasAPIOnSparkAdviceWarning)
> ok (15.809s)
>   test_groupby_rolling_sum 
> (pyspark.pandas.tests.connect.test_parity_ops_on_diff_frames_groupby_rolling.OpsOnDiffFramesGroupByRollingParityTests.test_groupby_rolling_sum)
>  ... ERROR StatusCo
> Had test failures in 
> pyspark.pandas.tests.connect.test_parity_ops_on_diff_frames_groupby_rolling 
> with python3.12; see logs.
> Error:  running /__w/spark/spark/python/run-tests 
> --modules=pyspark-pandas-connect-part2 --parallelism=1 
> --python-executables=pypy3,python3.10,python3.11,python3.12 ; received return 
> code 255
> Error: Process completed with exit code 19.
> {code}
> https://github.com/apache/spark/actions/runs/7034467411/job/19154767856
> I suspect this is caused by OOM. We could split the build; a rough sketch of the idea is below.
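> A minimal sketch of one way "splitting the build" could look, reusing the run-tests flags from the log above. The interpreter list is illustrative only, and this is not necessarily the change made in pull request 44088:
> {code}
> # Run the failing module group once per interpreter instead of passing all
> # interpreters to a single run-tests invocation, so each run keeps fewer
> # worker processes (and less memory) alive at a time.
> import subprocess
>
> MODULE = "pyspark-pandas-connect-part2"  # module group from the log above
> EXECUTABLES = ["python3.10", "python3.11", "python3.12"]  # illustrative list
>
> for exe in EXECUTABLES:
>     subprocess.run(
>         [
>             "python/run-tests",  # same script the scheduled job calls
>             f"--modules={MODULE}",
>             "--parallelism=1",
>             f"--python-executables={exe}",
>         ],
>         check=True,  # fail fast, like the scheduled job
>     )
> {code}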



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

