Evan Chan gave a great presentation on using Cassandra the way Jonathan describes below: "OLAP with Cassandra and Spark", http://www.slideshare.net/EvanChan2/2014-07olapcassspark
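On the "cache once a day, then serve fixed queries" part of the question quoted below, here is a rough sketch of the shape I have in mind. It is plain Python with no Spark or Cassandra in it: the `load_rows` callable is a hypothetical stand-in for whatever daily job produces the aggregated rows, and `DailyAggregateCache` and `views_by_page` are names I made up for illustration. Treat it as one possible layout under those assumptions, not as how it has to be done.

```python
import datetime as dt
import json
from collections import defaultdict


class DailyAggregateCache:
    """Keeps pre-aggregated totals in memory and rebuilds them at most once per day."""

    def __init__(self, load_rows):
        # load_rows is a hypothetical callable returning (day, page, views) tuples;
        # in a real setup it would read the output of the daily aggregation job.
        self._load_rows = load_rows
        self._loaded_on = None
        self._totals = {}

    def refresh_if_stale(self, today):
        """Rebuild the cache if it was not already built for `today`."""
        if self._loaded_on == today:
            return False
        totals = defaultdict(int)
        for day, page, views in self._load_rows():
            totals[(day, page)] += views
        self._totals = dict(totals)
        self._loaded_on = today
        return True

    def views_by_page(self, day):
        """A fixed query: total views per page for one day, as JSON-ready records."""
        return [
            {"page": page, "views": views}
            for (d, page), views in sorted(self._totals.items())
            if d == day
        ]


if __name__ == "__main__":
    # Made-up sample rows, only to show the call pattern.
    def sample_rows():
        return [("2015-02-03", "a", 2), ("2015-02-03", "a", 3), ("2015-02-03", "b", 1)]

    cache = DailyAggregateCache(sample_rows)
    cache.refresh_if_stale(dt.date.today())
    # A web handler could return this string for D3.js to plot.
    print(json.dumps(cache.views_by_page("2015-02-03")))
```

The web service would call `refresh_if_stale` before answering a request (or from a daily scheduler) and hand the list from the fixed query to the page as JSON. Swapping the in-memory dict for a Cassandra table filled by the Spark job keeps the same split between a daily write and cheap reads.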

On Tue Feb 03 2015 at 10:03:34 AM Jonathan Haddad <j...@jonhaddad.com> wrote:

> Write out the rdd to a cassandra table. The datastax driver provides
> saveToCassandra() for this purpose.
>
> On Tue Feb 03 2015 at 8:59:15 AM Adamantios Corais <
> adamantios.cor...@gmail.com> wrote:
>
>> Hi,
>>
>> After some research I have decided that Spark (SQL) would be ideal for
>> building an OLAP engine. My goal is to push aggregated data (to Cassandra
>> or other low-latency data storage) and then be able to project the results
>> on a web page (web service). New data will be added (aggregated) once a
>> day, only. On the other hand, the web service must be able to run some
>> fixed(?) queries (either on Spark or Spark SQL) at anytime and plot the
>> results with D3.js. Note that I can already achieve similar speeds while in
>> REPL mode by caching the data. Therefore, I believe that my problem must be
>> re-phrased as follows: "How can I automatically cache the data once a day
>> and make them available on a web service that is capable of running any
>> Spark or Spark (SQL) statement in order to plot the results with D3.js?"
>>
>> Note that I have already some experience in Spark (+Spark SQL) as well as
>> D3.js but not at all with OLAP engines (at least in their traditional form).
>>
>> Any ideas or suggestions?
>>
>>
>> *// Adamantios*