[spark] branch master updated: [SPARK-43517][PYTHON][DOCS] Add a migration guide for namedtuple monkey patch

gurwls223 Mon, 15 May 2023 19:18:03 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 23c072d2a0e [SPARK-43517][PYTHON][DOCS] Add a migration guide for namedtuple monkey patch
23c072d2a0e is described below

commit 23c072d2a0ef046f45893d9a13f5788e6ec09ea5
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Tue May 16 11:16:27 2023 +0900

    [SPARK-43517][PYTHON][DOCS] Add a migration guide for namedtuple monkey patch
    
    ### What changes were proposed in this pull request?
    
    This PR adds a migration guide entry for the change made in https://github.com/apache/spark/pull/38700.
    
    ### Why are the changes needed?
    
    To tell users how to bring the namedtuple monkey patch back as a workaround.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, it adds a migration guide entry for end users.
    
    ### How was this patch tested?
    
    CI in this PR will test it out.
    
    Closes #41177 from HyukjinKwon/update-migration-namedtuple.
    
    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/docs/source/migration_guide/pyspark_upgrade.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/python/docs/source/migration_guide/pyspark_upgrade.rst b/python/docs/source/migration_guide/pyspark_upgrade.rst
index d06475f9b36..7513d64ef6c 100644
--- a/python/docs/source/migration_guide/pyspark_upgrade.rst
+++ b/python/docs/source/migration_guide/pyspark_upgrade.rst
@@ -34,6 +34,7 @@ Upgrading from PySpark 3.3 to 3.4
 * In Spark 3.4, the ``DataFrame.__setitem__`` will make a copy and replace pre-existing arrays, which will NOT be over-written to follow pandas 1.4 behaviors.
 * In Spark 3.4, the ``SparkSession.sql`` and the Pandas on Spark API ``sql`` have got new parameter ``args`` which provides binding of named parameters to their SQL literals.
 * In Spark 3.4, Pandas API on Spark follows for the pandas 2.0, and some APIs were deprecated or removed in Spark 3.4 according to the changes made in pandas 2.0. Please refer to the [release notes of pandas](https://pandas.pydata.org/docs/dev/whatsnew/) for more details.
+* In Spark 3.4, the custom monkey-patch of ``collections.namedtuple`` was removed, and ``cloudpickle`` was used by default. To restore the previous behavior for any relevant pickling issue of ``collections.namedtuple``, set ``PYSPARK_ENABLE_NAMEDTUPLE_PATCH`` environment variable to ``1``.
 
 
 Upgrading from PySpark 3.2 to 3.3

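The workaround in the added migration note could be applied roughly as follows. This is a minimal sketch, not part of the commit. It assumes the flag is read when PySpark is first imported, so the variable is set before the import. It also forwards the variable to executors through the ``spark.executorEnv.*`` setting, on the assumption that worker processes may need it as well. ``build_session`` and the application name are hypothetical names used only for illustration.

```python
import os

# Assumption: PySpark checks this flag at import time, so set it before
# any pyspark module is imported in the driver process.
os.environ["PYSPARK_ENABLE_NAMEDTUPLE_PATCH"] = "1"


def build_session(app_name="namedtuple-patch-demo"):
    """Hypothetical helper: create a session with the flag forwarded to executors."""
    # Imported lazily so the environment variable above is already in place.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder.appName(app_name)
        # spark.executorEnv.<NAME> sets an environment variable on executors;
        # whether the workers need the flag too is an assumption of this sketch.
        .config("spark.executorEnv.PYSPARK_ENABLE_NAMEDTUPLE_PATCH", "1")
        .getOrCreate()
    )
```

Calling ``build_session()`` requires a working PySpark installation. Exporting the variable in the shell before launching the application would serve the same purpose for the driver.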

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
