[ https://issues.apache.org/jira/browse/SPARK-27387 ]
Bryan Cutler reassigned SPARK-27387:
------------------------------------

    Assignee: Bryan Cutler

> Replace sqlutils assertPandasEqual with Pandas assert_frame_equal in tests
> --------------------------------------------------------------------------
>
>                 Key: SPARK-27387
>                 URL: https://issues.apache.org/jira/browse/SPARK-27387
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Tests
>    Affects Versions: 2.4.1
>            Reporter: Bryan Cutler
>            Assignee: Bryan Cutler
>            Priority: Major
>             Fix For: 3.0.0
>
>
> In PySpark unit tests, sqlutils ReusedSQLTestCase.assertPandasEqual is meant
> to check whether two pandas.DataFrames are equal, but with later versions of
> Pandas it can fail if a DataFrame has an array column. The method can be
> replaced by {{assert_frame_equal}} from pandas.util.testing, which is designed
> for exactly this purpose and also produces a better assertion message.
> The test failure I have seen is:
> {noformat}
> ======================================================================
> ERROR: test_supported_types
> (pyspark.sql.tests.test_pandas_udf_grouped_map.GroupedMapPandasUDFTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/bryan/git/spark/python/pyspark/sql/tests/test_pandas_udf_grouped_map.py",
>     line 128, in test_supported_types
>     self.assertPandasEqual(expected1, result1)
>   File "/home/bryan/git/spark/python/pyspark/testing/sqlutils.py",
>     line 268, in assertPandasEqual
>     self.assertTrue(expected.equals(result), msg=msg)
>   File "/home/bryan/miniconda2/envs/pa012/lib/python3.6/site-packages/pandas
>   ...
>   File "pandas/_libs/lib.pyx", line 523, in pandas._libs.lib.array_equivalent_object
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
> {noformat}
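For illustration only (a minimal sketch, not the actual Spark patch), the check
in sqlutils could delegate to {{assert_frame_equal}} as below; the sample data
is made up here to mimic an array column like the one returned by the
grouped-map pandas UDF test:

{noformat}
import numpy as np
import pandas as pd
from pandas.util.testing import assert_frame_equal

# Two frames with an array-valued object column, the shape of data
# that triggered the reported failure.
expected = pd.DataFrame({"id": [1, 2],
                         "vals": [np.array([1, 2, 3]), np.array([4, 5, 6])]})
result = pd.DataFrame({"id": [1, 2],
                       "vals": [np.array([1, 2, 3]), np.array([4, 5, 6])]})

# expected.equals(result) can raise
#   ValueError: The truth value of an array with more than one element is ambiguous.
# because the object-column comparison goes through array_equivalent_object.

# assert_frame_equal compares object columns element-wise and raises an
# AssertionError with a readable diff only when the frames actually differ.
assert_frame_equal(expected, result)
{noformat}

On pandas versions where pandas.util.testing is deprecated, the same function
is also available as pandas.testing.assert_frame_equal.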