This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new cb1e1f5cd49a [SPARK-47969][PYTHON][TESTS] Make `test_creation_index` deterministic
cb1e1f5cd49a is described below

commit cb1e1f5cd49a612c0c081949759c1f931883c263
Author: Ruifeng Zheng <ruife...@apache.org>
AuthorDate: Tue Apr 23 23:09:10 2024 -0700

    [SPARK-47969][PYTHON][TESTS] Make `test_creation_index` deterministic
    
    ### What changes were proposed in this pull request?
    Make `test_creation_index` deterministic by calling `sort_index()` on both DataFrames before comparing them.
    
    ### Why are the changes needed?
    The test may fail in some environments because the index order of the resulting DataFrame is not deterministic:
    ```
    FAIL [16.261s]: test_creation_index (pyspark.pandas.tests.frame.test_constructor.FrameConstructorTests.test_creation_index)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/jenkins/python/pyspark/testing/pandasutils.py", line 91, in _assert_pandas_equal
        assert_frame_equal(
      File "/databricks/python3/lib/python3.11/site-packages/pandas/_testing/asserters.py", line 1257, in assert_frame_equal
        assert_index_equal(
      File "/databricks/python3/lib/python3.11/site-packages/pandas/_testing/asserters.py", line 407, in assert_index_equal
        raise_assert_detail(obj, msg, left, right)
      File "/databricks/python3/lib/python3.11/site-packages/pandas/_testing/asserters.py", line 665, in raise_assert_detail
        raise AssertionError(msg)
    AssertionError: DataFrame.index are different
    DataFrame.index values are different (40.0 %)
    [left]:  Int64Index([2, 3, 4, 6, 5], dtype='int64')
    [right]: Int64Index([2, 3, 4, 5, 6], dtype='int64')
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    No, this is a test-only change.
    
    ### How was this patch tested?
    CI.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No.
    
    Closes #46200 from zhengruifeng/fix_test_creation_index.
    
    Authored-by: Ruifeng Zheng <ruife...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 python/pyspark/pandas/tests/frame/test_constructor.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/python/pyspark/pandas/tests/frame/test_constructor.py b/python/pyspark/pandas/tests/frame/test_constructor.py
index ee010d8f023d..d7581895c6c9 100644
--- a/python/pyspark/pandas/tests/frame/test_constructor.py
+++ b/python/pyspark/pandas/tests/frame/test_constructor.py
@@ -195,14 +195,14 @@ class FrameConstructorMixin:
         with ps.option_context("compute.ops_on_diff_frames", True):
             # test with ps.DataFrame and pd.Index
             self.assert_eq(
-                ps.DataFrame(data=psdf, index=pd.Index([2, 3, 4, 5, 6])),
-                pd.DataFrame(data=pdf, index=pd.Index([2, 3, 4, 5, 6])),
+                ps.DataFrame(data=psdf, index=pd.Index([2, 3, 4, 5, 6])).sort_index(),
+                pd.DataFrame(data=pdf, index=pd.Index([2, 3, 4, 5, 6])).sort_index(),
             )
 
             # test with ps.DataFrame and ps.Index
             self.assert_eq(
-                ps.DataFrame(data=psdf, index=ps.Index([2, 3, 4, 5, 6])),
-                pd.DataFrame(data=pdf, index=pd.Index([2, 3, 4, 5, 6])),
+                ps.DataFrame(data=psdf, index=ps.Index([2, 3, 4, 5, 6])).sort_index(),
+                pd.DataFrame(data=pdf, index=pd.Index([2, 3, 4, 5, 6])).sort_index(),
             )
 
         # test String Index
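
For readers without a Spark environment, the effect of the change can be sketched with plain pandas. This is only an illustration, not code from the Spark test suite: the frame contents, the column name `a`, and the helper `frames_equal` are hypothetical, and the reordering is simulated by hand using the index order reported in the failure (2, 3, 4, 6, 5).

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical data standing in for the test's `pdf`; the values are made up.
expected = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=pd.Index([2, 3, 4, 5, 6]))

# Simulate a result whose rows came back in a different order,
# matching the index order seen in the reported failure.
actual = expected.loc[[2, 3, 4, 6, 5]]


def frames_equal(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical helper: True if assert_frame_equal passes, else False."""
    try:
        assert_frame_equal(left, right)
        return True
    except AssertionError:
        return False


# Comparing as-is depends on row order, so this comparison fails.
order_sensitive = frames_equal(actual, expected)

# Sorting both sides by index removes the dependence on row order.
order_insensitive = frames_equal(actual.sort_index(), expected.sort_index())

print(order_sensitive, order_insensitive)
```

Sorting both frames keeps the assertion strict about values and index labels while ignoring the order in which rows happen to be returned.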


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
