Tathagata - Yes, I'm thinking along those lines.

The problem is how to send the query to the backend. Should I bundle an HTTP
server into the Spark Streaming job, so that it can accept the query
parameters at runtime? Something like the sketch below.
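
Roughly what I have in mind - an untested sketch against the Spark 1.2 APIs,
where the Count schema, the socket source on port 9999 and the HTTP port 8090
are all placeholders for our actual pipeline:

import java.net.{InetSocketAddress, URLDecoder}

import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Placeholder schema for the aggregated counts.
case class Count(dimension: String, value: Long)

object QueryableStreamingApp {
  def main(args: Array[String]): Unit = {
    val sc  = new SparkContext(new SparkConf().setAppName("monitoring"))
    val ssc = new StreamingContext(sc, Seconds(10))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD[Count] -> SchemaRDD

    // Placeholder source - swap in the real input DStream. Each 10s batch
    // re-registers the counts as a temp table that ad-hoc SQL can query.
    ssc.socketTextStream("localhost", 9999)
      .map(_.split(","))
      .map(a => Count(a(0), a(1).toLong))
      .foreachRDD(rdd => rdd.registerTempTable("counts"))

    // Embedded HTTP endpoint on the driver: GET /query?q=<url-encoded SQL>
    val server = HttpServer.create(new InetSocketAddress(8090), 0)
    server.createContext("/query", new HttpHandler {
      def handle(ex: HttpExchange): Unit = {
        val raw = Option(ex.getRequestURI.getQuery).getOrElse("")
        val sql = URLDecoder.decode(raw.stripPrefix("q="), "UTF-8")
        val body =
          try sqlContext.sql(sql).collect().mkString("\n")
          catch { case e: Exception => "error: " + e.getMessage }
        val bytes = body.getBytes("UTF-8")
        ex.sendResponseHeaders(200, bytes.length)
        ex.getResponseBody.write(bytes)
        ex.getResponseBody.close()
      }
    })
    server.start()

    ssc.start()
    ssc.awaitTermination()
  }
}

The point being that the server lives in the driver process, so it can see the
temp tables the streaming job keeps re-registering. (I'm ignoring, for the
sketch, races between re-registration and a query in flight - a real setup
would want the split frontend/backend you describe.)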

--
Nikhil Bafna

On Mon, Feb 23, 2015 at 2:04 PM, Tathagata Das <t...@databricks.com> wrote:

> You will have to build a split infrastructure - a front end that takes the
> queries from the UI and sends them to the backend, and the backend (running
> the Spark Streaming app) will actually run the queries on tables created in
> the contexts. The RPCs necessary between the frontend and backend will need
> to be implemented by you.
>
> On Sat, Feb 21, 2015 at 11:57 PM, Nikhil Bafna <nikhil.ba...@flipkart.com>
> wrote:
>
>>
>> Yes. As I understand it, that would let me write SQL queries against a
>> Spark context, but the query needs to be specified within a job & deployed.
>>
>> What I want is to be able to run multiple dynamic queries specified at
>> runtime from a dashboard.
>>
>>
>>
>> --
>> Nikhil Bafna
>>
>> On Sat, Feb 21, 2015 at 8:37 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Have you looked at
>>> http://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
>>> ?
>>>
>>> Cheers
>>>
>>> On Sat, Feb 21, 2015 at 4:24 AM, Nikhil Bafna <nikhil.ba...@flipkart.com
>>> > wrote:
>>>
>>>>
>>>> Hi.
>>>>
>>>> My use case is building a real-time monitoring system over
>>>> multi-dimensional data.
>>>>
>>>> The way I'm planning to go about it is to use Spark Streaming to store
>>>> aggregated counts over all dimensions at 10-second intervals.
>>>>
>>>> Then, from a dashboard, I would be able to specify a query over some
>>>> dimensions, which will need re-aggregation over the counts the job has
>>>> already computed.
>>>>
>>>> My question is: how can I run dynamic queries over data in SchemaRDDs?
>>>>
>>>> --
>>>> Nikhil Bafna
>>>>
>>>
>>>
>>
>