spark git commit: [SPARK-25471][PYTHON][TEST] Fix pyspark-sql test error when using Python 3.6 and Pandas 0.23

gurwls223 Wed, 19 Sep 2018 18:30:41 -0700

Repository: spark
Updated Branches:
  refs/heads/branch-2.3 7b5da37c0 -> e319a624e



[SPARK-25471][PYTHON][TEST] Fix pyspark-sql test error when using Python 3.6 
and Pandas 0.23

## What changes were proposed in this pull request?

Fix a test that constructs a Pandas DataFrame by specifying the column order 
explicitly. Previously, the test assumed the columns would be sorted 
alphabetically. However, when using Python 3.6 with Pandas 0.23 or higher, the 
original column order is maintained, so the columns get mixed up and the test 
errors.
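
A minimal sketch of the behavior described above, outside the Spark test suite. The variable names `data`, `implicit`, and `explicit` and the fixed date value are illustrative assumptions, not part of the patch. Passing `columns=` pins the order, so the result no longer depends on the Python or Pandas version:

```python
from datetime import date, datetime

import pandas as pd

# Same shape as the test data: "ts" is listed before "d" in the dict.
# The fixed date is a stand-in for pd.Timestamp.now().date().
data = {"ts": [datetime(2017, 10, 31, 1, 1, 1)], "d": [date(2018, 9, 19)]}

# Without `columns`, the column order depends on the Python/Pandas versions:
# older combinations sort the keys, newer ones keep the dict's order.
implicit = pd.DataFrame(data)

# With `columns`, the order is fixed regardless of version.
explicit = pd.DataFrame(data, columns=["d", "ts"])

print(list(implicit.columns))  # version-dependent order
print(list(explicit.columns))  # always d, then ts
```

Pinning the order in the constructor keeps the rest of the test unchanged, since later assertions can keep relying on `d` coming before `ts`.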

Manually tested with `python/run-tests` using Python 3.6.6 and Pandas 0.23.4

Closes #22477 from BryanCutler/pyspark-tests-py36-pd23-SPARK-25471.

Authored-by: Bryan Cutler <cutl...@gmail.com>
Signed-off-by: hyukjinkwon <gurwls...@apache.org>
(cherry picked from commit 90e3955f384ca07bdf24faa6cdb60ded944cf0d8)
Signed-off-by: hyukjinkwon <gurwls...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e319a624
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e319a624
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e319a624

Branch: refs/heads/branch-2.3
Commit: e319a624e2f366a941bd92a685e1b48504c887b1
Parents: 7b5da37
Author: Bryan Cutler <cutl...@gmail.com>
Authored: Thu Sep 20 09:29:29 2018 +0800
Committer: hyukjinkwon <gurwls...@apache.org>
Committed: Thu Sep 20 09:30:06 2018 +0800

----------------------------------------------------------------------
 python/pyspark/sql/tests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e319a624/python/pyspark/sql/tests.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 6bfb329..3c5fc97 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -2885,7 +2885,7 @@ class SQLTests(ReusedSQLTestCase):
         import pandas as pd
         from datetime import datetime
         pdf = pd.DataFrame({"ts": [datetime(2017, 10, 31, 1, 1, 1)],
-                            "d": [pd.Timestamp.now().date()]})
+                            "d": [pd.Timestamp.now().date()]}, columns=["d", "ts"])
         # test types are inferred correctly without specifying schema
         df = self.spark.createDataFrame(pdf)
         self.assertTrue(isinstance(df.schema['ts'].dataType, TimestampType))


