This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push: new 5f7aceb [SPARK-28240][PYTHON] Fix Arrow tests to pass with Python 2.7 and latest PyArrow and Pandas in PySpark 5f7aceb is described below commit 5f7aceb9df472709ffcd3b06d1132be1b077291b Author: HyukjinKwon <gurwls...@apache.org> AuthorDate: Wed Jul 3 17:46:31 2019 +0900 [SPARK-28240][PYTHON] Fix Arrow tests to pass with Python 2.7 and latest PyArrow and Pandas in PySpark ## What changes were proposed in this pull request? In Python 2.7 with latest PyArrow and Pandas, the error message seems a bit different with Python 3. This PR simply fixes the test. ``` ====================================================================== FAIL: test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.ArrowTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/.../spark/python/pyspark/sql/tests/test_arrow.py", line 275, in test_createDataFrame_with_incorrect_schema self.spark.createDataFrame(pdf, schema=wrong_schema) AssertionError: "integer.*required.*got.*str" does not match "('Exception thrown when converting pandas.Series (object) to Arrow Array (int32). It can be caused by overflows or other unsafe conversions warned by Arrow. Arrow safe type check can be disabled by using SQL config `spark.sql.execution.pandas.arrowSafeTypeConversion`.', ArrowTypeError('an integer is required',))" ====================================================================== FAIL: test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/.../spark/python/pyspark/sql/tests/test_arrow.py", line 275, in test_createDataFrame_with_incorrect_schema self.spark.createDataFrame(pdf, schema=wrong_schema) AssertionError: "integer.*required.*got.*str" does not match "('Exception thrown when converting pandas.Series (object) to Arrow Array (int32). It can be caused by overflows or other unsafe conversions warned by Arrow. Arrow safe type check can be disabled by using SQL config `spark.sql.execution.pandas.arrowSafeTypeConversion`.', ArrowTypeError('an integer is required',))" ``` ## How was this patch tested? Manually tested. ``` cd python ./run-tests --python-executables=python --modules pyspark-sql ``` Closes #25042 from HyukjinKwon/SPARK-28240. Authored-by: HyukjinKwon <gurwls...@apache.org> Signed-off-by: HyukjinKwon <gurwls...@apache.org> --- python/pyspark/sql/tests/test_arrow.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/python/pyspark/sql/tests/test_arrow.py b/python/pyspark/sql/tests/test_arrow.py index 1f96d2c..f533083 100644 --- a/python/pyspark/sql/tests/test_arrow.py +++ b/python/pyspark/sql/tests/test_arrow.py @@ -271,7 +271,7 @@ class ArrowTests(ReusedSQLTestCase): fields[0], fields[1] = fields[1], fields[0] # swap str with int wrong_schema = StructType(fields) with QuietTest(self.sc): - with self.assertRaisesRegexp(Exception, "integer.*required.*got.*str"): + with self.assertRaisesRegexp(Exception, "integer.*required"): self.spark.createDataFrame(pdf, schema=wrong_schema) def test_createDataFrame_with_names(self): --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org