[ https://issues.apache.org/jira/browse/SPARK-44101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739502#comment-17739502 ]
Haejoon Lee commented on SPARK-44101:
-------------------------------------

I would like to reiterate the key decisions regarding the pandas 2.0 upgrade here:

pandas 2.0.0, released on April 3, 2023, introduced numerous breaking changes. We have therefore decided to postpone addressing these breaking changes until the next major release of Spark, version 4.0.0, to minimize disruption for our users and provide a smoother upgrade experience.

The pandas 2.0.0 release includes a significant number of updates: API removals, changes in API behavior, parameter removals, parameter behavior changes, and bug fixes. We have planned the following approach for each item:

- {*}API Removals{*}: APIs removed in pandas 2.0.0 will be kept but deprecated in Spark 3.5.0, with appropriate warnings, and will be removed in Spark 4.0.0.
- {*}API Behavior Changes{*}: APIs with changed behavior will retain their existing behavior in Spark 3.5.0, with appropriate warnings, and will align with pandas in Spark 4.0.0.
- {*}Parameter Removals{*}: Parameters removed in pandas 2.0.0 will be kept but deprecated in Spark 3.5.0, with appropriate warnings, and will be removed in Spark 4.0.0.
- {*}Parameter Behavior Changes{*}: Parameters with changed behavior will retain their existing behavior in Spark 3.5.0, with appropriate warnings, and will align with pandas in Spark 4.0.0.
- {*}Bug Fixes{*}: Bug fixes, mainly those related to correctness issues, will be applied in Spark 3.5.0.

*To recap, all breaking changes related to pandas 2.0.0 will be adopted in Spark 4.0.0, and the affected APIs will remain available but deprecated, with appropriate warnings, in Spark 3.5.0.*

I will submit a PR that deprecates all affected APIs and adds the warnings very soon.
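As a rough illustration of the "keep the old behavior, warn now, remove later" approach for API and parameter removals, here is a minimal sketch in plain Python. It is not the PySpark implementation: `deprecated`, `legacy_method`, `compute`, and `legacy_flag` are hypothetical names made up for this example, and the choice of `FutureWarning` is an assumption rather than something stated in this comment.

```python
import functools
import warnings


def deprecated(removed_in, note=""):
    """Hypothetical decorator: warn on every call, but keep the old behavior."""

    def decorator(func):
        message = (
            f"{func.__name__} is deprecated and will be removed in {removed_in}. {note}"
        ).strip()

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(message, FutureWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


# API removal: the function still works in the current release, but warns.
@deprecated(removed_in="Spark 4.0.0", note="It was removed in pandas 2.0.0.")
def legacy_method(values):
    return list(values)


# Sentinel so that an explicitly passed value can be told apart from the default.
_UNSET = object()


def compute(values, legacy_flag=_UNSET):
    """Parameter removal: warn only when the deprecated parameter is passed."""
    if legacy_flag is not _UNSET:
        warnings.warn(
            "The 'legacy_flag' parameter is deprecated and will be removed "
            "in Spark 4.0.0.",
            FutureWarning,
            stacklevel=2,
        )
    else:
        legacy_flag = False
    ordered = sorted(values)
    # The old behavior is retained while the parameter is still accepted.
    return ordered[::-1] if legacy_flag else ordered
```

`FutureWarning` is used in this sketch because Python shows it to end users by default, whereas `DeprecationWarning` is hidden by default outside of `__main__`.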
Also cc [~panbingkun] [~bjornjorgensen] FYI

> Support pandas 2
> ----------------
>
> Key: SPARK-44101
> URL: https://issues.apache.org/jira/browse/SPARK-44101
> Project: Spark
> Issue Type: Umbrella
> Components: Pandas API on Spark, PySpark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major