You could build a REST API, but you may have issues if you want to return arbitrary binary data. A more complex but robust alternative is to use an RPC library like Akka, Thrift, etc.
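For the HTTP route, here is a rough, untested sketch using only the JDK's built-in com.sun.net.httpserver, so it runs without any extra dependencies. The QueryServer and runQuery names are made up for illustration; in the real driver, runQuery would call something like sqlContext.sql(queryText) against a table the streaming job registers, instead of the stub below:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class QueryServer {

    // Stub standing in for sqlContext.sql(query) in the actual Spark driver.
    // Here it just echoes the query back in a JSON-ish envelope.
    static String runQuery(String query) {
        return "{\"query\": \"" + query + "\", \"rows\": []}";
    }

    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/query", exchange -> {
            // The dashboard POSTs the SQL text as the raw request body.
            InputStream in = exchange.getRequestBody();
            String query = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            byte[] response = runQuery(query).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, response.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(response);
            }
        });
        server.start();
        return server;
    }
}
```

The catch with this approach is exactly the one above: returning arbitrary binary data cleanly over HTTP takes more care (content types, encoding), which is where Thrift or similar RPC frameworks earn their complexity.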
TD

On Mon, Feb 23, 2015 at 12:45 AM, Nikhil Bafna <nikhil.ba...@flipkart.com> wrote:

> Tathagata - Yes, I'm thinking along that line.
>
> The problem is how to send the query to the backend? Bundle an HTTP
> server into the Spark Streaming job, that will accept the parameters?
>
> --
> Nikhil Bafna
>
> On Mon, Feb 23, 2015 at 2:04 PM, Tathagata Das <t...@databricks.com> wrote:
>
>> You will have to build a split infrastructure - a front end that takes the
>> queries from the UI and sends them to the backend, and the backend (running
>> the Spark Streaming app) will actually run the queries on tables created in
>> the contexts. The RPCs necessary between the frontend and backend will need
>> to be implemented by you.
>>
>> On Sat, Feb 21, 2015 at 11:57 PM, Nikhil Bafna <nikhil.ba...@flipkart.com> wrote:
>>
>>> Yes. As I understand it, it would allow me to write SQL to query a
>>> Spark context. But the query needs to be specified within a job & deployed.
>>>
>>> What I want is to be able to run multiple dynamic queries specified at
>>> runtime from a dashboard.
>>>
>>> --
>>> Nikhil Bafna
>>>
>>> On Sat, Feb 21, 2015 at 8:37 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Have you looked at
>>>> http://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
>>>> ?
>>>>
>>>> Cheers
>>>>
>>>> On Sat, Feb 21, 2015 at 4:24 AM, Nikhil Bafna <nikhil.ba...@flipkart.com> wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> My use case is building a realtime monitoring system over
>>>>> multi-dimensional data.
>>>>>
>>>>> The way I'm planning to go about it is to use Spark Streaming to store
>>>>> aggregated counts over all dimensions at 10-second intervals.
>>>>>
>>>>> Then, from a dashboard, I would be able to specify a query over some
>>>>> dimensions, which will need re-aggregation from the already computed job.
>>>>>
>>>>> My question is, how can I run dynamic queries over data in SchemaRDDs?
>>>>>
>>>>> --
>>>>> Nikhil Bafna