Ruifeng Zheng created SPARK-54936:
-------------------------------------
Summary: Monitor upstream behaviour changes
Key: SPARK-54936
URL: https://issues.apache.org/jira/browse/SPARK-54936
Project: Spark
Issue Type: Umbrella
Components: PySpark, Tests
Affects Versions: 4.2.0
Reporter: Ruifeng Zheng
PySpark is frequently affected by behaviour changes in upstream libraries
such as pandas, PyArrow, and NumPy.
We should add tests to monitor the behaviour of key functions/features, like:
* pa.Array.to_pandas
* pa.Array.from_pandas
* pa.Array.cast
* pa.array
* pa.scalar
* pd.Series(data=array_data)
* time zone handling in pyarrow, pandas
* zero copy in pandas<->pyarrow data conversions
* etc
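As a rough illustration of what such monitoring tests could look like, here is
a minimal sketch covering a few of the items above. The class name
UpstreamBehaviourTests and the helper run_suite are hypothetical, and the
pinned behaviours are only examples of what one might choose to watch, not a
list agreed on in this ticket. The tests are skipped if the libraries are not
installed.

```python
import importlib.util
import unittest

# Skip everything when the upstream libraries are not installed.
HAVE_DEPS = all(
    importlib.util.find_spec(name) is not None
    for name in ("pyarrow", "pandas", "numpy")
)


@unittest.skipUnless(HAVE_DEPS, "pyarrow, pandas and numpy are required")
class UpstreamBehaviourTests(unittest.TestCase):
    """Hypothetical test class; it imports nothing from pyspark."""

    def test_pa_array_infers_int64(self):
        import pyarrow as pa

        self.assertEqual(pa.array([1, 2, 3]).type, pa.int64())

    def test_pa_scalar_infers_type(self):
        import pyarrow as pa

        self.assertEqual(pa.scalar(1).type, pa.int64())
        self.assertEqual(pa.scalar("a").type, pa.string())

    def test_cast_overflow_raises(self):
        import pyarrow as pa

        # Safe casting is the default, so 300 must not fit into int8.
        with self.assertRaises(pa.ArrowInvalid):
            pa.array([300]).cast(pa.int8())

    def test_to_pandas_nulls_become_float64(self):
        import pyarrow as pa

        series = pa.array([1, None, 3]).to_pandas()
        self.assertEqual(str(series.dtype), "float64")
        self.assertEqual(series.isna().tolist(), [False, True, False])

    def test_from_pandas_nan_becomes_null(self):
        import numpy as np
        import pandas as pd
        import pyarrow as pa

        arr = pa.Array.from_pandas(pd.Series([1.0, np.nan]))
        self.assertEqual(arr.null_count, 1)

    def test_zero_copy_rejected_with_nulls(self):
        import pyarrow as pa

        # An integer array with nulls cannot be converted without a copy.
        with self.assertRaises(pa.ArrowInvalid):
            pa.array([1, None, 3]).to_pandas(zero_copy_only=True)


def run_suite():
    """Run the sketch above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(UpstreamBehaviourTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() against a new pandas/PyArrow/NumPy release would then flag
any of these pinned behaviours that has changed. The remaining items (pd.Series
construction, time zone handling) would need similar tests once the expected
behaviour is written down.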
The new tests should exercise only the upstream libraries; no Spark code
should be involved.
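One possible way to enforce this constraint is a small static check that a
test file does not import pyspark. This is only a sketch: the function name
imports_pyspark is hypothetical and nothing in this ticket prescribes such a
check.

```python
import ast


def imports_pyspark(source: str) -> bool:
    """Return True if the given Python source imports pyspark."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) have no absolute module to check.
            if node.module and node.level == 0:
                names = [node.module]
            else:
                names = []
        else:
            continue
        if any(n == "pyspark" or n.startswith("pyspark.") for n in names):
            return True
    return False
```

For example, imports_pyspark("import pyarrow as pa") returns False, while
imports_pyspark("from pyspark.sql import functions") returns True. The check
only sees static import statements, so dynamic imports would go unnoticed.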