Hi. My use case is building a realtime monitoring system over multi-dimensional data.
The way I'm planning to go about it is to use Spark Streaming to store aggregated count over all dimensions in 10 sec interval. Then, from a dashboard, I would be able to specify a query over some dimensions, which will need re-aggregation from the already computed job. My query is, how can I run dynamic queries over data in schema RDDs? -- Nikhil Bafna