[ https://issues.apache.org/jira/browse/SPARK-38820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525648#comment-17525648 ]
Yikun Jiang edited comment on SPARK-38820 at 4/26/22 3:18 AM:
--------------------------------------------------------------

[https://pandas.pydata.org/docs/whatsnew/v1.4.0.html#index-can-hold-arbitrary-extensionarrays]

https://github.com/pandas-dev/pandas/commit/e750c94bf1

was (Author: yikunkero):
https://pandas.pydata.org/docs/whatsnew/v1.4.0.html#index-can-hold-arbitrary-extensionarrays

> Support Index can hold arbitrary ExtensionArrays
> ------------------------------------------------
>
>                 Key: SPARK-38820
>                 URL: https://issues.apache.org/jira/browse/SPARK-38820
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.4.0
>            Reporter: Yikun Jiang
>            Priority: Major
>
> {code:java}
> ERROR [1.717s]: test_astype (pyspark.pandas.tests.data_type_ops.test_boolean_ops.BooleanExtensionOpsTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 121, in assertPandasEqual
>     assert_series_equal(
>   File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 1019, in assert_series_equal
>     assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
>   File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 506, in assert_attr_equal
>     raise_assert_detail(obj, msg, left_attr, right_attr)
> AssertionError: Attributes of Series are different
>
> Attribute "dtype" are different
> [left]:  CategoricalDtype(categories=[False, True], ordered=False)
> [right]: CategoricalDtype(categories=[False, True], ordered=False)
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "/__w/spark/spark/python/pyspark/pandas/tests/data_type_ops/test_boolean_ops.py", line 746, in test_astype
>     self.assert_eq(pser.astype("category"), psser.astype("category"))
>   File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 229, in assert_eq
>     self.assertPandasEqual(lobj, robj, check_exact=check_exact)
>   File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 134, in assertPandasEqual
>     raise AssertionError(msg) from e
> AssertionError: Attributes of Series are different
>
> Attribute "dtype" are different
> [left]:  CategoricalDtype(categories=[False, True], ordered=False)
> [right]: CategoricalDtype(categories=[False, True], ordered=False)
>
> Left:
> Name: this, dtype: category
> Categories (2, boolean): [False, True]
> category
>
> Right:
> Name: this, dtype: category
> Categories (2, object): [False, True]
> category
> {code}
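
The two dtypes in the traceback print identically, but the "Categories (2, boolean)" and "Categories (2, object)" lines suggest that the dtype of the categories Index is what differs. A minimal pandas-only sketch of that mismatch follows, assuming pandas 1.4 or later (where an Index can hold the nullable "boolean" extension dtype). It does not involve pandas-on-Spark, and the names object_cats, boolean_cats and categories_dtypes_match are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Categories stored in a plain object-dtype Index
# (the "Categories (2, object)" side of the traceback).
object_cats = pd.CategoricalDtype(
    categories=pd.Index([False, True], dtype=object)
)

# Categories stored in an Index backed by the nullable "boolean"
# extension array (assumes pandas >= 1.4; the "Categories (2, boolean)" side).
boolean_cats = pd.CategoricalDtype(
    categories=pd.Index(pd.array([False, True], dtype="boolean"))
)


def categories_dtypes_match(left, right):
    """Hypothetical helper: compare only the dtype of the categories Index."""
    return left.categories.dtype == right.categories.dtype


# Same category values, different categories dtype.
print(object_cats.categories.dtype, boolean_cats.categories.dtype)
print(categories_dtypes_match(object_cats, boolean_cats))
print(object_cats == boolean_cats)
```

Printing the categories dtype alongside the CategoricalDtype itself is one way to make this kind of failure readable when the two reprs look the same.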