Hi All,

We would like to discuss increasing the minimum supported version of Pandas in Spark, which is currently 0.19.2.
Pandas 0.19.2 was released nearly 3 years ago, and PySpark contains some workarounds that could be removed if such an old version no longer needs to be supported. Removing them would help keep the code clean and reduce maintenance effort. The change is targeted for the Spark 3.0.0 release; see https://issues.apache.org/jira/browse/SPARK-28041. The current thought is to bump the minimum version to 0.23.2, but we would like to discuss this before making the change. Does anyone have thoughts on this?

Regards,
Bryan
