This API looks like it starts from scratch and has no relationship with the 
existing Java/Scala DataSourceV2 API. In particular, how would Python data 
sources support SQL?

We have gone back and forth on the DataSource V2 design since 2.3, so I believe 
there are lessons to learn from it when introducing the Python DataSource API.

Thanks,
Cheng Pan




> On Jun 16, 2023, at 12:14, Allison Wang <allison.w...@databricks.com.INVALID> 
> wrote:
> 
> Hi everyone,
> 
> I would like to start a discussion on “Python Data Source API”.
> 
> This proposal aims to introduce a simple API in Python for Data Sources. The 
> idea is to enable Python developers to create data sources without having to 
> learn Scala or deal with the complexities of the current data source APIs. 
> The goal is to make a Python-based API that is simple and easy to use, thus 
> making Spark more accessible to the wider Python developer community. This 
> proposed approach is based on the recently introduced Python user-defined 
> table functions with extensions to support data sources.
> 
> SPIP Doc:  
> https://docs.google.com/document/d/1oYrCKEKHzznljYfJO4kx5K_Npcgt1Slyfph3NEk7JRU/edit?usp=sharing
> 
> SPIP JIRA: https://issues.apache.org/jira/browse/SPARK-44076
> 
> Looking forward to your feedback.
> 
> Thanks,
> Allison

