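Write the aggregated RDD out to a Cassandra table. The DataStax Spark Cassandra Connector provides saveToCassandra() for this purpose, so your web service can then query Cassandra directly instead of going through Spark.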
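
For example, here is a minimal sketch of what the daily job could look like. The keyspace, table, and column names are made up for illustration, the sample rows stand in for your real aggregation, and it assumes the table already exists and that Cassandra is reachable on localhost:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // brings saveToCassandra into scope for RDDs

object SaveDailyAggregates {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("daily-aggregates")
      // Assumption: a Cassandra node is listening on localhost.
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // Placeholder for the real once-a-day aggregation: (day, metric, total).
    val aggregated = sc.parallelize(Seq(
      ("2015-02-03", "page_views", 1200L),
      ("2015-02-03", "signups", 35L)
    ))

    // Hypothetical table, assumed to have been created beforehand:
    //   CREATE TABLE olap.daily_totals (
    //     day text, metric text, total bigint,
    //     PRIMARY KEY (day, metric));
    // Tuple fields are written to the listed columns in order.
    aggregated.saveToCassandra("olap", "daily_totals",
      SomeColumns("day", "metric", "total"))

    sc.stop()
  }
}
```

You would submit this once a day with spark-submit (from cron or whatever scheduler you use), and the web service only needs a plain Cassandra client to read the results for D3.js.
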
On Tue Feb 03 2015 at 8:59:15 AM Adamantios Corais <adamantios.cor...@gmail.com> wrote:

> Hi,
>
> After some research I have decided that Spark (SQL) would be ideal for
> building an OLAP engine. My goal is to push aggregated data (to Cassandra
> or other low-latency data storage) and then be able to project the results
> on a web page (web service). New data will be added (aggregated) once a
> day, only. On the other hand, the web service must be able to run some
> fixed(?) queries (either on Spark or Spark SQL) at anytime and plot the
> results with D3.js. Note that I can already achieve similar speeds while in
> REPL mode by caching the data. Therefore, I believe that my problem must be
> re-phrased as follows: "How can I automatically cache the data once a day
> and make them available on a web service that is capable of running any
> Spark or Spark (SQL) statement in order to plot the results with D3.js?"
>
> Note that I have already some experience in Spark (+Spark SQL) as well as
> D3.js but not at all with OLAP engines (at least in their traditional form).
>
> Any ideas or suggestions?
>
> *// Adamantios*