Hello. I'm working on a project to run analytic processing using Spark/PySpark. Right now, I connect to the PySpark shell and execute my commands interactively. The commands start by creating a SQL JDBC connection and cursor to pull data from Apache Phoenix, then do some processing on the returned data and print some output. I want to build a web GUI tool where I can play around with which SQL query is executed for my analysis.
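To make the current flow concrete, here is roughly what I run in the shell today (a rough sketch only, assuming a jaydebeapi-style JDBC client; the ZooKeeper quorum, jar path, table, and query are all placeholders):

    # Rough sketch of the interactive flow: open a JDBC connection to
    # Phoenix, run a query, pull the rows back, and hand them to Spark.
    import jaydebeapi

    conn = jaydebeapi.connect(
        "org.apache.phoenix.jdbc.PhoenixDriver",
        "jdbc:phoenix:zk-host:2181",            # placeholder ZooKeeper quorum
        [],
        "/path/to/phoenix-client.jar",          # placeholder Phoenix client jar
    )
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM MY_TABLE LIMIT 100")   # placeholder query
    rows = cursor.fetchall()

    # `sc` is the SparkContext the PySpark shell already provides.
    rdd = sc.parallelize(rows)
    print(rdd.count())          # stand-in for the real processing/output step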
I know that I can write my whole Spark program, run it with spark-submit, and have it accept an argument that is the SQL query I want to execute. But that means every time I submit: a SQL connection is created, the query is run, the processing is done, the output is printed, the program closes, and the SQL connection closes; the whole cycle repeats if I want to run another query right away. That will probably be very slow.

Is there a way to keep the SQL connection "alive" in the backend, so that all I have to do is supply a query from my GUI tool, which then takes it, runs it, and displays the output? I just want the big picture and a broad overview of how I would go about doing this and what additional technology to use; I'll dig up the rest.

Regards,
Alaa Ali
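P.S. For concreteness, the spark-submit version I'm describing would look roughly like the sketch below (again, the connection details, jar path, and app name are placeholders, and jaydebeapi is just an example JDBC client), invoked as something like spark-submit run_query.py "SELECT ...":

    # run_query.py -- rough sketch of the one-shot spark-submit flow:
    # the Spark context, JDBC connection, query, processing, and output
    # are all created and torn down on every single submission.
    import sys
    import jaydebeapi
    from pyspark import SparkContext

    def main():
        query = sys.argv[1]                       # the SQL passed as an argument
        sc = SparkContext(appName="phoenix-analysis")   # placeholder app name

        conn = jaydebeapi.connect(
            "org.apache.phoenix.jdbc.PhoenixDriver",
            "jdbc:phoenix:zk-host:2181",          # placeholder ZooKeeper quorum
            [],
            "/path/to/phoenix-client.jar",        # placeholder client jar
        )
        cursor = conn.cursor()
        cursor.execute(query)
        rows = cursor.fetchall()

        # ... some processing on the returned data ...
        print(sc.parallelize(rows).count())

        conn.close()                              # connection closes every run
        sc.stop()                                 # Spark context closes every run

    if __name__ == "__main__":
        main()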