Hi all,

I have a requirement to integrate a custom data store with Spark (v2.0.1). The store consists of structured data in tables, along with their schemas.
I then want to run Spark SQL queries on that data and return the results to the data service, and I'm wondering what the best way to do this would be. Should I extend DataFrame and expose my wrapped data store as DataFrames, or extend DataFrameReader? Ideally I'd like to make minimal changes to the Spark code itself and instead write something like an external client that submits jobs against my data to a Spark cluster. I'm completely new to the Spark world, so any help would be much appreciated.

--
Thanks,
Sachith Withana
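P.S. To make the question concrete: from the docs it looks like Spark's external data source API (org.apache.spark.sql.sources) might be the intended extension point, rather than subclassing DataFrame or DataFrameReader directly. Below is a rough, untested sketch of what I'm imagining. DataStoreClient, the com.example.datastore package, and the "table" option are placeholders for my store's actual API, not anything real. Is this the right direction?

package com.example.datastore

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, RelationProvider, TableScan}
import org.apache.spark.sql.types.StructType

// Hypothetical wrapper around my data store; stands in for the real service API.
object DataStoreClient {
  def schemaFor(table: String): StructType = ???     // the table's schema, from the store
  def scanTable(table: String): Seq[Seq[Any]] = ???  // all rows of the table
}

// Entry point Spark instantiates for format("com.example.datastore").
class DefaultSource extends RelationProvider {
  override def createRelation(sqlContext: SQLContext,
                              parameters: Map[String, String]): BaseRelation =
    new DataStoreRelation(sqlContext, parameters("table"))
}

// Exposes one table of the store as a relation that Spark SQL can query.
class DataStoreRelation(val sqlContext: SQLContext, table: String)
    extends BaseRelation with TableScan {

  override def schema: StructType = DataStoreClient.schemaFor(table)

  // Full table scan; Spark SQL applies the query's filters/projections on top.
  override def buildScan(): RDD[Row] =
    sqlContext.sparkContext.parallelize(
      DataStoreClient.scanTable(table).map(Row.fromSeq))
}

If that's roughly right, I assume I'd then query it from the client side with something like the following (hypothetical table name):

val df = spark.read.format("com.example.datastore").option("table", "my_table").load()
df.createOrReplaceTempView("my_table")
spark.sql("SELECT * FROM my_table").show()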