This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 23c072d2a0e [SPARK-43517][PYTHON][DOCS] Add a migration guide for namedtuple monkey patch
23c072d2a0e is described below

commit 23c072d2a0ef046f45893d9a13f5788e6ec09ea5
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Tue May 16 11:16:27 2023 +0900

    [SPARK-43517][PYTHON][DOCS] Add a migration guide for namedtuple monkey patch

    ### What changes were proposed in this pull request?

    This PR proposes to add a migration guide for https://github.com/apache/spark/pull/38700.

    ### Why are the changes needed?

    To guide users about the workaround of bringing the namedtuple patch back.

    ### Does this PR introduce _any_ user-facing change?

    Yes, it adds the migration guides for end-users.

    ### How was this patch tested?

    CI in this PR will test it out.

    Closes #41177 from HyukjinKwon/update-migration-namedtuple.

    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/docs/source/migration_guide/pyspark_upgrade.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/python/docs/source/migration_guide/pyspark_upgrade.rst b/python/docs/source/migration_guide/pyspark_upgrade.rst
index d06475f9b36..7513d64ef6c 100644
--- a/python/docs/source/migration_guide/pyspark_upgrade.rst
+++ b/python/docs/source/migration_guide/pyspark_upgrade.rst
@@ -34,6 +34,7 @@ Upgrading from PySpark 3.3 to 3.4
 * In Spark 3.4, the ``DataFrame.__setitem__`` will make a copy and replace pre-existing arrays, which will NOT be over-written to follow pandas 1.4 behaviors.
 * In Spark 3.4, the ``SparkSession.sql`` and the Pandas on Spark API ``sql`` have got new parameter ``args`` which provides binding of named parameters to their SQL literals.
 * In Spark 3.4, Pandas API on Spark follows for the pandas 2.0, and some APIs were deprecated or removed in Spark 3.4 according to the changes made in pandas 2.0. Please refer to the [release notes of pandas](https://pandas.pydata.org/docs/dev/whatsnew/) for more details.
+* In Spark 3.4, the custom monkey-patch of ``collections.namedtuple`` was removed, and ``cloudpickle`` was used by default. To restore the previous behavior for any relevant pickling issue of ``collections.namedtuple``, set ``PYSPARK_ENABLE_NAMEDTUPLE_PATCH`` environment variable to ``1``.
 
 
 Upgrading from PySpark 3.2 to 3.3
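For readers who want to try the workaround from the added migration note, the following is a minimal sketch of setting the variable from Python. The variable name and the value ``1`` come from the note; everything else is an assumption made for illustration. In particular, the helpers enable_namedtuple_patch and executor_env_conf are hypothetical names, and the sketch assumes that the variable should be set before PySpark is first imported and that executors may need it too. The ``spark.executorEnv.<name>`` prefix is Spark's general setting for passing environment variables to executors; whether it is required for this patch is not stated in the commit.

```python
import os

# Name and value are taken from the migration note in the diff above.
PATCH_ENV_VAR = "PYSPARK_ENABLE_NAMEDTUPLE_PATCH"


def enable_namedtuple_patch() -> None:
    """Hypothetical helper: request the pre-3.4 namedtuple patch in this process."""
    os.environ[PATCH_ENV_VAR] = "1"


def executor_env_conf() -> dict:
    """Hypothetical helper: Spark conf entry that forwards the variable to executors.

    Assumption: executors may need the variable as well; the commit does not say so.
    """
    return {f"spark.executorEnv.{PATCH_ENV_VAR}": "1"}


# Assumption: set the variable before the first PySpark import in this process.
enable_namedtuple_patch()

# from pyspark.sql import SparkSession  # import PySpark only after this point
# builder = SparkSession.builder
# for key, value in executor_env_conf().items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```

Exporting the variable in the shell that launches the application (for example, PYSPARK_ENABLE_NAMEDTUPLE_PATCH=1) should have the same effect on the driver process and avoids any dependence on import order.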