This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c7d9bd2  [SPARK-36348][PYTHON][FOLLOWUP] Complete test_astype for index
c7d9bd2 is described below

commit c7d9bd2e70c29781678f7809b6848ba3c5bba4ea
Author: itholic <haejoon....@databricks.com>
AuthorDate: Wed Oct 27 14:53:51 2021 +0900

    [SPARK-36348][PYTHON][FOLLOWUP] Complete test_astype for index

    ### What changes were proposed in this pull request?

    This is a follow-up to https://github.com/apache/spark/pull/34335.

    ### Why are the changes needed?

    The previous bug depends on the pandas version, not the Spark version, so the discrepancy still occurs with pandas < 1.3. For example:

    ```python
    # Spark 3.2 with pandas 1.2.
    >>> pidx = pd.Index([10, 20, 15, 30, 45, None], name="x")
    >>> psidx = ps.Index(pidx)
    >>> pidx
    Index([10, 20, 15, 30, 45, None], dtype='object', name='x')
    >>> psidx
    Float64Index([10.0, 20.0, 15.0, 30.0, 45.0, nan], dtype='float64', name='x')
    >>> pidx.astype(str)
    Index(['10', '20', '15', '30', '45', 'None'], dtype='object', name='x')
    >>> psidx.astype(str)
    Index(['10.0', '20.0', '15.0', '30.0', '45.0', 'nan'], dtype='object', name='x')
    ```

    I think many people are still using pandas < 1.3, so maybe we'd better separate the test for old versions of pandas for now.

    ### Does this PR introduce _any_ user-facing change?

    No, it is test-only.

    ### How was this patch tested?

    Unit tests.

    Closes #34397 from itholic/SPARK-36348-followup.

    Authored-by: itholic <haejoon....@databricks.com>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/pandas/tests/indexes/test_base.py | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/python/pyspark/pandas/tests/indexes/test_base.py b/python/pyspark/pandas/tests/indexes/test_base.py
index a7f19a7..e7e5216 100644
--- a/python/pyspark/pandas/tests/indexes/test_base.py
+++ b/python/pyspark/pandas/tests/indexes/test_base.py
@@ -2243,8 +2243,17 @@ class IndexesTest(PandasOnSparkTestCase, TestUtils):
         pidx = pd.Index([10, 20, 15, 30, 45, None], name="x")
         psidx = ps.Index(pidx)
 
-        self.assert_eq(psidx.astype(bool), pidx.astype(bool))
-        self.assert_eq(psidx.astype(str), pidx.astype(str))
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3"):
+            self.assert_eq(psidx.astype(bool), pidx.astype(bool))
+            self.assert_eq(psidx.astype(str), pidx.astype(str))
+        else:
+            self.assert_eq(
+                psidx.astype(bool), ps.Index([True, True, True, True, True, True], name="x")
+            )
+            self.assert_eq(
+                psidx.astype(str),
+                ps.Index(["10.0", "20.0", "15.0", "30.0", "45.0", "nan"], name="x"),
+            )
 
         pidx = pd.Index(["hi", "hi ", " ", " \t", "", None], name="x")
         psidx = ps.Index(pidx)
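
The mismatch described in the commit message can be imitated in plain Python, without pandas or Spark. The sketch below is only an illustration: `stringify_as_object`, `stringify_as_float64`, and `is_at_least` are hypothetical helpers, not pandas or pyspark APIs, and the version comparison is a simplified stand-in for the `LooseVersion` check used in the test. It shows why an object-typed index renders `None` as `'None'` and `10` as `'10'`, while a float64-typed index renders them as `'nan'` and `'10.0'`.

```python
import math

# Same values as in the test; None marks the missing entry.
values = [10, 20, 15, 30, 45, None]


def stringify_as_object(items):
    """Imitate astype(str) on an object-typed index: each element keeps
    its own Python type, so 10 -> '10' and None -> 'None'."""
    return [str(item) for item in items]


def stringify_as_float64(items):
    """Imitate astype(str) on a float64-typed index: the values are
    converted to floats first (None becomes NaN), then to strings."""
    as_floats = [math.nan if item is None else float(item) for item in items]
    return [str(item) for item in as_floats]


def is_at_least(version, minimum):
    """Hypothetical, simplified version gate (assumption: only purely
    numeric dot-separated components are compared)."""
    def key(text):
        return tuple(int(part) for part in text.split(".") if part.isdigit())
    return key(version) >= key(minimum)


object_strings = stringify_as_object(values)
float_strings = stringify_as_float64(values)

# The two renderings differ, which is the gap the version-gated branch
# in the test accounts for on older pandas.
print(object_strings)
print(float_strings)
print(is_at_least("1.2.5", "1.3"), is_at_least("1.3.0", "1.3"))
```

In the actual test, the gate selects between comparing against pandas directly and comparing against hard-coded expected values; the helper here only mirrors that branching decision.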