Hi all: When using PySpark, I want to read data from a MySQL database, so I looked at JdbcRDD, but it is not supported in PySpark.
For some reasons I can't use the DataFrame API and can only use the RDD API, even though I know the DataFrame API can read from a JDBC source fairly well. So I want to implement the ability to read from a JDBC source into an RDD in PySpark. I don't know whether this is actually needed in PySpark, so I'd like to discuss it here. If it is needed, I want to contribute it to Spark: I'd like to create a JIRA ticket and hope it can be assigned to me.

I am a big data engineer and like contributing to open source. I have already submitted two PRs to Apache Flink (FLINK-26609, FLINK-26728), and both were merged/closed. So I think if I get the JIRA ticket, I can implement this fairly well.

Thanks.

javaca...@163.com
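For context, here is a minimal sketch of the DataFrame-based route I mentioned. The connection details (host, database, table, credentials) are all hypothetical, and it assumes the MySQL Connector/J driver is on the Spark classpath; it illustrates that today the only way to end up with an RDD from a JDBC source in PySpark goes through the DataFrame reader first:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-to-rdd-demo").getOrCreate()

# The DataFrame reader handles JDBC sources well.
# url/dbtable/user/password below are placeholder values.
df = (spark.read.format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/test")
      .option("dbtable", "users")
      .option("user", "root")
      .option("password", "secret")
      .load())

# .rdd converts the result to an RDD of Row objects, but this still
# goes through the DataFrame API, which is what I need to avoid.
rdd = df.rdd
print(rdd.take(5))
```

For comparison, Scala/Java users have `org.apache.spark.rdd.JdbcRDD` in Spark core, but it has no PySpark counterpart; that is the gap I am proposing to fill.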