[ 
https://issues.apache.org/jira/browse/SPARK-25471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-25471:
------------------------------------

    Assignee: Bryan Cutler

> Fix tests for Python 3.6 with Pandas 0.23+
> ------------------------------------------
>
>                 Key: SPARK-25471
>                 URL: https://issues.apache.org/jira/browse/SPARK-25471
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Tests
>    Affects Versions: 2.4.0
>            Reporter: Bryan Cutler
>            Assignee: Bryan Cutler
>            Priority: Major
>             Fix For: 2.3.3, 2.4.0, 3.0.0
>
>
> Running the PySpark tests produces at least one error when using Python 3.6
> with Pandas 0.23 or higher. With these versions, the Pandas DataFrame
> constructor can create columns in the order in which they are defined,
> whereas earlier versions might put them in alphabetical order. The schema in
> the test is then applied to columns in a different order than expected,
> which leads to the following failure:
> {noformat}
> ======================================================================
> ERROR: test_create_dataframe_from_pandas_with_timestamp (pyspark.sql.tests.SQLTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/bryan/git/spark/python/pyspark/sql/tests.py", line 3275, in test_create_dataframe_from_pandas_with_timestamp
>     df = self.spark.createDataFrame(pdf, schema="d date, ts timestamp")
>   File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 748, in createDataFrame
>     rdd, schema = self._createFromLocal(map(prepare, data), schema)
>   File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 413, in _createFromLocal
>     data = list(data)
>   File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 730, in prepare
>     verify_func(obj)
>   File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1389, in verify
>     verify_value(obj)
>   File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1370, in verify_struct
>     verifier(v)
>   File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1389, in verify
>     verify_value(obj)
>   File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1383, in verify_default
>     verify_acceptable_types(obj)
>   File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1278, in verify_acceptable_types
>     % (dataType, obj, type(obj))))
> TypeError: field ts: TimestampType can not accept object datetime.date(2018, 9, 19) in type <class 'datetime.date'>
> {noformat}
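
The ordering problem described above can be illustrated without Pandas or
Spark. The following is only a rough sketch under stated assumptions: the
`data` dict, the `schema` list, and the `mismatched_fields` helper are
hypothetical stand-ins, not code from the test suite or from PySpark, and the
exact-type check is a simplification of what Spark's verifier does. It pairs
values with schema fields by position, once with alphabetically sorted keys
and once with insertion-ordered keys.

```python
import datetime

# Hypothetical test data: keys are inserted in the order "ts", then "d".
data = {
    "ts": datetime.datetime(2018, 9, 19, 12, 0, 0),
    "d": datetime.date(2018, 9, 19),
}

# Positional schema, modelled on schema="d date, ts timestamp".
schema = [("d", datetime.date), ("ts", datetime.datetime)]


def mismatched_fields(column_order):
    """Pair values with schema fields by position and return the names of
    the schema fields whose value has an unexpected type."""
    values = [data[name] for name in column_order]
    bad = []
    for (field, expected), value in zip(schema, values):
        # datetime.datetime subclasses datetime.date, so compare exact types.
        if type(value) is not expected:
            bad.append(field)
    return bad


sorted_order = sorted(data)    # alphabetical column order: "d", "ts"
insertion_order = list(data)   # definition order: "ts", "d"

print(mismatched_fields(sorted_order))     # no mismatches
print(mismatched_fields(insertion_order))  # both fields mismatched

# One possible remedy (an assumption, not necessarily the actual fix):
# select values by schema field name so that column order does not matter.
by_name = [field for field, _ in schema]
print(mismatched_fields(by_name))          # no mismatches
```

With alphabetical ordering the positional pairing happens to line up with the
schema; with definition ordering it does not, which matches the kind of
TypeError shown in the traceback.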


