Hello. I'm working on a project to run analytic processing using Spark/PySpark.
Right now, I connect to the shell and execute my commands. The first part of my
commands creates a JDBC SQL connection and cursor to pull data from Apache
Phoenix; the rest does some processing on the returned data and prints some
output. I want to build a web "GUI" tool of sorts where I can play around with
which SQL query gets executed for my analysis.
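
For context, the per-query flow I have today looks roughly like the sketch
below. It is only a minimal sketch: the ZooKeeper host in the Phoenix JDBC URL
and the example query are placeholders, I'm using Spark's generic JDBC reader
rather than any Phoenix-specific connector, and I'm assuming a Spark version
new enough to have SparkSession.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("phoenix-analysis").getOrCreate()

# The piece I want to change from run to run
query = "SELECT HOST, AVG(LATENCY) FROM METRICS GROUP BY HOST"

# Pull the query result out of Phoenix over JDBC into a DataFrame
df = (spark.read.format("jdbc")
      .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
      .option("url", "jdbc:phoenix:zk-host:2181")   # placeholder ZK quorum
      .option("dbtable", "(" + query + ") AS q")    # run the query server-side
      .load())

# ... processing on df goes here ...
df.show()

spark.stop()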

I know I could write my whole Spark program, run it with spark-submit, and have
it accept an argument that is the SQL query to execute. But that means every
time I submit, a SQL connection is created, the query is run, the processing is
done, the output is printed, and then the program exits and the SQL connection
closes; the whole cycle repeats if I want to run another query right away. That
will probably be very slow. Is there a way to keep the SQL connection "working"
in the backend, so that all I have to do is supply a query from my GUI tool,
which then takes it, runs it, and displays the output? I just want the big
picture and a broad overview of how I would go about doing this and what
additional technology to use; I'll dig up the rest.
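
To make the question concrete, here is the kind of thing I have in mind, purely
as a sketch of the idea rather than something I've tried: a small long-running
web service (Flask here only as an example) that holds one SparkSession open
and runs whatever query the GUI posts to it. The endpoint name, port, and
Phoenix connection details are all placeholders.

from flask import Flask, request, jsonify
from pyspark.sql import SparkSession

app = Flask(__name__)

# Created once when the service starts and kept alive across requests,
# instead of paying the startup/connection cost on every query.
spark = SparkSession.builder.appName("phoenix-query-backend").getOrCreate()

@app.route("/run", methods=["POST"])
def run_query():
    query = request.get_json()["query"]   # SQL text supplied by the GUI
    df = (spark.read.format("jdbc")
          .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
          .option("url", "jdbc:phoenix:zk-host:2181")   # placeholder ZK quorum
          .option("dbtable", "(" + query + ") AS q")
          .load())
    # ... the usual processing on df would go here ...
    rows = [row.asDict() for row in df.limit(100).collect()]
    return jsonify(rows=rows)

if __name__ == "__main__":
    app.run(port=5000)

The GUI would then just POST a JSON body like {"query": "..."} to /run and
render whatever comes back. Is something along these lines the right direction,
or is there a more standard technology for this?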

Regards,
Alaa Ali
