CaoYu created SPARK-40502:
-----------------------------

             Summary: Support dataframe API use jdbc data source in PySpark
                 Key: SPARK-40502
                 URL: https://issues.apache.org/jira/browse/SPARK-40502
             Project: Spark
          Issue Type: New Feature
          Components: PySpark
    Affects Versions: 3.3.0
            Reporter: CaoYu
When using PySpark, I want to read data from a MySQL database, so I would like to use a JdbcRDD the way Java/Scala can. But that is not supported in PySpark. For some reasons I cannot use the DataFrame API and can only use the RDD (datastream) API, even though I know the DataFrame API can read from JDBC sources fairly well. So I want to implement functionality that lets an RDD read data from a JDBC source in PySpark.

*But I don't know whether that is necessary for PySpark, so we can discuss it.*

*If it is necessary for PySpark, I want to contribute it to Spark.*

*I hope this Jira task can be assigned to me, so I can start working on the implementation.*

*If not, please close this Jira task.*

*Thanks a lot.*

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
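For context, the Scala JdbcRDD referenced above shards a query by splitting an inclusive numeric key range evenly across partitions, and each partition then runs the query with its own bounds substituted into the WHERE clause. A minimal Python sketch of that partitioning logic is below; the function name and exact rounding are illustrative only, not the actual Spark implementation:

```python
def jdbc_partition_bounds(lower_bound, upper_bound, num_partitions):
    """Split the inclusive key range [lower_bound, upper_bound] into
    num_partitions contiguous (start, end) sub-ranges, in the spirit of
    how Scala's JdbcRDD shards its query across partitions."""
    length = upper_bound - lower_bound + 1
    bounds = []
    for i in range(num_partitions):
        # Integer arithmetic spreads any remainder across partitions.
        start = lower_bound + (i * length) // num_partitions
        end = lower_bound + ((i + 1) * length) // num_partitions - 1
        bounds.append((start, end))
    return bounds

# Each partition would then execute something like:
#   SELECT * FROM t WHERE id >= start AND id <= end
print(jdbc_partition_bounds(1, 100, 4))
# -> [(1, 25), (26, 50), (51, 75), (76, 100)]
```

A PySpark RDD-level JDBC reader would pair each such bound with a per-partition database connection and cursor, which is the part that currently has no counterpart outside the DataFrame API.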