[DISCUSS] SPIP: Python Data Source API

2023-06-15 Thread Allison Wang
Hi everyone, I would like to start a discussion on “Python Data Source API”. This proposal aims to introduce a simple API in Python for Data Sources. The idea is to enable Python developers to create data sources without having to learn Scala or deal with the complexities of the current data

Re: [VOTE] Release Plan for Apache Spark 4.0.0 (June 2024)

2023-06-15 Thread Hyukjin Kwon
I am supportive of setting the timeline for Spark 4.0, and I think it has to be done soon. If my understanding is correct, we had better set the goals and major changes for 4.0.0? That one I agree with too. Having a preview sounds good to me too so people can try it out. Given

Re: [VOTE] Release Plan for Apache Spark 4.0.0 (June 2024)

2023-06-15 Thread Xiao Li
Since the vote includes the release date for Spark 4.0, I cast my vote as -1, in light of the discussions from the three other PMCs. Also, considering recent discussions on the dev list, numerous breaking changes, such as Scala 2.13, JDK 17 support, and pandas 2.0 support, will be incorporated

Re: [DISCUSS] SPIP: Add PySpark Test Framework

2023-06-15 Thread Mich Talebzadeh
+1 for me. The SPIP document is well written as well. HTH Mich Talebzadeh, Lead Solutions Architect/Engineering Lead Palantir Technologies Limited London United Kingdom