GitHub user BryanCutler opened a pull request:

    https://github.com/apache/spark/pull/20213

    [SPARK-23018][PYTHON] Fix createDataFrame from Pandas timestamp series assignment

    ## What changes were proposed in this pull request?
    
    This fixes createDataFrame from Pandas so that only modified timestamp
    series are assigned back, and only to a copied version of the Pandas
    DataFrame. Previously, if the Pandas DataFrame was only a reference
    (e.g. a slice of another), each series was still assigned back to the
    reference even if it was not a modified timestamp column. This caused
    the following warning: "SettingWithCopyWarning: A value is trying to be
    set on a copy of a slice from a DataFrame."
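
    The copy-on-first-modification idea can be sketched roughly as below.
    This is an illustration only, not the code in the patch: the names
    `convert_timestamps` and `to_naive_utc` are hypothetical, and the
    conversion rule (tz-aware to naive UTC) is an assumption chosen to keep
    the example small.

```python
import pandas as pd


def to_naive_utc(series):
    """Hypothetical converter: return the same object if nothing changes."""
    if series.dt.tz is None:
        return series  # already naive, so leave the series untouched
    return series.dt.tz_convert("UTC").dt.tz_localize(None)


def convert_timestamps(pdf, convert=to_naive_utc):
    """Assign back only modified timestamp columns, and only to a copy."""
    copied = False
    for column in pdf.columns:
        series = pdf[column]
        if not pd.api.types.is_datetime64_any_dtype(series):
            continue  # non-timestamp columns are never reassigned
        converted = convert(series)
        if converted is not series:
            if not copied:
                # Copy once, lazily, so the caller's frame (possibly a
                # slice of another frame) is never written to.
                pdf = pdf.copy()
                copied = True
            pdf[column] = converted
    return pdf


base = pd.DataFrame({
    "id": [1, 2, 3],
    "ts": pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03"]).tz_localize("UTC"),
})
sliced = base[base["id"] > 1]           # a slice that refers back to `base`
result = convert_timestamps(sliced)     # modified, so a copy is returned
untouched = convert_timestamps(base[["id"]].copy())  # no timestamp columns
```

    In this sketch a frame without modified timestamp columns is returned
    as-is, so no assignment ever happens on a slice.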
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark pyspark-createDataFrame-copy-slice-warn-SPARK-23018

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20213
    
----
commit bdeead620783df3d5b39897cba7001105b2816a7
Author: Bryan Cutler <cutlerb@...>
Date:   2018-01-09T23:51:25Z

    Changed createDataFrame to only assign series if modified timestamp field

----



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
