[ 
https://issues.apache.org/jira/browse/SPARK-38819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543450#comment-17543450
 ] 

Yikun Jiang edited comment on SPARK-38819 at 9/22/22 1:37 AM:
--------------------------------------------------------------

All UT / doctest failures have been listed here, and the corresponding PRs have 
been submitted. Keeping complete compatibility with pandas is really hard, and 
there's no good way to do it other than fixing the issues one by one.

Note that this only means the current PS test failures have been fixed. There 
are still quite a lot of new pandas features and bugfixes that we haven't synced.

What should we do with the rest? I think the priorities might be:
 * Priority 0: Fix all existing UT/doctest failures (what we do in this umbrella)
 * Priority 1: Follow the main features/breaking changes:
 ** [https://pandas.pydata.org/pandas-docs/version/1.4.2/whatsnew/v1.4.2.html#enhancements|https://pandas.pydata.org/pandas-docs/version/1.4.2/whatsnew/v1.3.0.html#enhancements]
 ** [https://pandas.pydata.org/pandas-docs/version/1.4.2/whatsnew/v1.4.2.html#notable-bug-fixes|https://pandas.pydata.org/pandas-docs/version/1.4.2/whatsnew/v1.3.0.html#notable-bug-fixes]
 * Priority 2: On demand: act only when somebody raises a new feature/bug in 
Jira
 * Priority 3: Follow all main features and bugfixes (impossible to some 
extent)

cc [~hyukjin.kwon]  [~XinrongM] [~itholic] [~podongfeng]  Any idea?


was (Author: yikunkero):
All UT / doctest failures have been listed here, and the corresponding PRs have 
been submitted. Keeping complete compatibility with pandas is really hard, and 
there's no good way to do it other than fixing the issues one by one.

Note that this only means the current PS test failures have been fixed. There 
are still quite a lot of new pandas features and bugfixes that we haven't synced.

What should we do with the rest? I think the priorities might be:
 * Priority 0: Fix all existing UT/doctest failures (what we do in this umbrella)
 * Priority 1: Follow the main features/breaking changes:
 ** [https://pandas.pydata.org/pandas-docs/version/1.4.2/whatsnew/v1.3.0.html#enhancements]
 ** [https://pandas.pydata.org/pandas-docs/version/1.4.2/whatsnew/v1.3.0.html#notable-bug-fixes]
 * Priority 2: On demand: act only when somebody raises a new feature/bug in 
Jira
 * Priority 3: Follow all main features and bugfixes (impossible to some 
extent)

cc [~hyukjin.kwon]  [~XinrongM] [~itholic] [~podongfeng]  Any idea?

> Run Pandas on Spark with Pandas 1.4.x
> -------------------------------------
>
>                 Key: SPARK-38819
>                 URL: https://issues.apache.org/jira/browse/SPARK-38819
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Pandas API on Spark, PySpark
>    Affects Versions: 3.4.0
>            Reporter: Yikun Jiang
>            Assignee: Yikun Jiang
>            Priority: Major
>
> This is an umbrella to track issues when upgrading pandas to 1.4.x
>  
> I disabled fail-fast in the tests, and 19 failed:
> [https://github.com/Yikun/spark/pull/88/checks?check_run_id=5873627048]
>  
>  



