This API looks like it starts from scratch and has no relationship to the existing Java/Scala DataSourceV2 API. In particular, how would it support SQL?
We have gone back and forth on the DataSource V2 design since 2.3, and I believe there are lessons from that experience to apply when introducing the Python DataSource API.

Thanks,
Cheng Pan

> On Jun 16, 2023, at 12:14, Allison Wang <allison.w...@databricks.com.INVALID>
> wrote:
>
> Hi everyone,
>
> I would like to start a discussion on “Python Data Source API”.
>
> This proposal aims to introduce a simple API in Python for Data Sources. The
> idea is to enable Python developers to create data sources without having to
> learn Scala or deal with the complexities of the current data source APIs.
> The goal is to make a Python-based API that is simple and easy to use, thus
> making Spark more accessible to the wider Python developer community. This
> proposed approach is based on the recently introduced Python user-defined
> table functions with extensions to support data sources.
>
> SPIP Doc:
> https://docs.google.com/document/d/1oYrCKEKHzznljYfJO4kx5K_Npcgt1Slyfph3NEk7JRU/edit?usp=sharing
>
> SPIP JIRA: https://issues.apache.org/jira/browse/SPARK-44076
>
> Looking forward to your feedback.
>
> Thanks,
> Allison