[spark] branch master updated: [SPARK-28240][PYTHON] Fix Arrow tests to pass with Python 2.7 and latest PyArrow and Pandas in PySpark

gurwls223 Wed, 03 Jul 2019 01:47:21 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 5f7aceb  [SPARK-28240][PYTHON] Fix Arrow tests to pass with Python 2.7 and latest PyArrow and Pandas in PySpark
5f7aceb is described below

commit 5f7aceb9df472709ffcd3b06d1132be1b077291b
Author: HyukjinKwon <gurwls...@apache.org>
AuthorDate: Wed Jul 3 17:46:31 2019 +0900

    [SPARK-28240][PYTHON] Fix Arrow tests to pass with Python 2.7 and latest PyArrow and Pandas in PySpark
    
    ## What changes were proposed in this pull request?
    
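    In Python 2.7 with the latest PyArrow and Pandas, the error message differs
    slightly from the one in Python 3: it lacks the trailing "got ... str" part
    that the test's regex expects. This PR relaxes the regex in the test accordingly.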
    
    ```
    ======================================================================
    FAIL: test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.ArrowTests)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/.../spark/python/pyspark/sql/tests/test_arrow.py", line 275, in test_createDataFrame_with_incorrect_schema
        self.spark.createDataFrame(pdf, schema=wrong_schema)
    AssertionError: "integer.*required.*got.*str" does not match "('Exception thrown when converting pandas.Series (object) to Arrow Array (int32). It can be caused by overflows or other unsafe conversions warned by Arrow. Arrow safe type check can be disabled by using SQL config `spark.sql.execution.pandas.arrowSafeTypeConversion`.', ArrowTypeError('an integer is required',))"
    
    ======================================================================
    FAIL: test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.EncryptionArrowTests)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/.../spark/python/pyspark/sql/tests/test_arrow.py", line 275, in test_createDataFrame_with_incorrect_schema
        self.spark.createDataFrame(pdf, schema=wrong_schema)
    AssertionError: "integer.*required.*got.*str" does not match "('Exception thrown when converting pandas.Series (object) to Arrow Array (int32). It can be caused by overflows or other unsafe conversions warned by Arrow. Arrow safe type check can be disabled by using SQL config `spark.sql.execution.pandas.arrowSafeTypeConversion`.', ArrowTypeError('an integer is required',))"
    
    ```
    
    ## How was this patch tested?
    
    Manually tested.
    
    ```
    cd python
    ./run-tests --python-executables=python --modules pyspark-sql
    ```
    
    Closes #25042 from HyukjinKwon/SPARK-28240.
    
    Authored-by: HyukjinKwon <gurwls...@apache.org>
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
---
 python/pyspark/sql/tests/test_arrow.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/sql/tests/test_arrow.py b/python/pyspark/sql/tests/test_arrow.py
index 1f96d2c..f533083 100644
--- a/python/pyspark/sql/tests/test_arrow.py
+++ b/python/pyspark/sql/tests/test_arrow.py
@@ -271,7 +271,7 @@ class ArrowTests(ReusedSQLTestCase):
         fields[0], fields[1] = fields[1], fields[0]  # swap str with int
         wrong_schema = StructType(fields)
         with QuietTest(self.sc):
-            with self.assertRaisesRegexp(Exception, "integer.*required.*got.*str"):
+            with self.assertRaisesRegexp(Exception, "integer.*required"):
                 self.spark.createDataFrame(pdf, schema=wrong_schema)
 
     def test_createDataFrame_with_names(self):
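
For readers who want to see why the relaxed pattern passes, here is a minimal
sketch, not part of the commit. It relies on `assertRaisesRegexp` using
`re.search` semantics. The Python 2.7 message is copied from the failure quoted
in the commit message. The Python 3 message is an assumption, inferred only from
the old pattern. `matches` is a hypothetical helper.

```python
import re

# Patterns before and after the change in test_arrow.py.
OLD_PATTERN = "integer.*required.*got.*str"
NEW_PATTERN = "integer.*required"

# Tail of the Python 2.7 message, as quoted in the commit message.
PY27_MESSAGE = "ArrowTypeError('an integer is required',)"

# ASSUMPTION: illustrative Python 3 style wording, inferred from the old
# pattern only; the real message is not shown in the commit.
PY3_MESSAGE_ASSUMED = "ArrowTypeError('an integer is required (got type str)')"


def matches(pattern, message):
    """Hypothetical helper: mimic the re.search check done by assertRaisesRegexp."""
    return re.search(pattern, message) is not None


for label, message in [("py27", PY27_MESSAGE), ("py3 (assumed)", PY3_MESSAGE_ASSUMED)]:
    print(label, "old:", matches(OLD_PATTERN, message), "new:", matches(NEW_PATTERN, message))
```

Under these assumptions, only the old pattern fails on the Python 2.7 message,
while the shorter pattern matches both wordings.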


