We have gone down a similar path at Webtrends; Spark has worked amazingly well for us in this use case. Our solution goes from REST directly into Spark and back out to the UI instantly.
Here is the resulting product in case you are curious (and please pardon the self-promotion): https://www.webtrends.com/support-training/training/explore-onboarding/

> How can I automatically cache the data once a day...

If you are not memory-bound, you could easily cache the daily results for some span of time and re-union them together each time you add new data. You would serve queries off the unioned RDD.

> ... and make them available on a web service

From the unioned RDD you could always step into Spark SQL at that point. Or you could use a simple scatter/gather pattern for this. As with all things Spark, this is super easy to do: just use aggregate()()!

Cheers,
Sean

On Feb 3, 2015, at 9:59 AM, Adamantios Corais <adamantios.cor...@gmail.com> wrote:

Hi,

After some research I have decided that Spark (SQL) would be ideal for building an OLAP engine. My goal is to push aggregated data (to Cassandra or another low-latency data store) and then be able to project the results on a web page (web service). New data will be added (aggregated) only once a day. On the other hand, the web service must be able to run some fixed(?) queries (either on Spark or Spark SQL) at any time and plot the results with D3.js.

Note that I can already achieve similar speeds while in REPL mode by caching the data. Therefore, I believe that my problem should be re-phrased as follows: "How can I automatically cache the data once a day and make them available on a web service that is capable of running any Spark or Spark (SQL) statement in order to plot the results with D3.js?"

Note that I already have some experience with Spark (+ Spark SQL) as well as D3.js, but not at all with OLAP engines (at least in their traditional form).

Any ideas or suggestions?

// Adamantios
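
For what it's worth, the cache-and-union pattern described above could be sketched roughly like this in Scala. This is a minimal illustration, not our production code: all names (DailyCacheServer, addDay, totalsPerKey, the CSV layout of the daily files) are assumptions invented for the example, and a real service would also need to unpersist retired daily RDDs and schedule addDay from whatever runs your daily ingest.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Illustrative sketch only: keep one cached RDD per ingested day,
// union them all, and serve web-service queries off the unioned RDD.
object DailyCacheServer {
  val sc = new SparkContext(new SparkConf().setAppName("olap-cache"))

  @volatile private var days: List[RDD[(String, Double)]] = Nil
  @volatile private var combined: RDD[(String, Double)] = sc.emptyRDD

  // Called once a day when new aggregates land; assumes "key,value" CSV lines.
  def addDay(path: String): Unit = synchronized {
    val day = sc.textFile(path)
      .map(_.split(","))
      .map(a => (a(0), a(1).toDouble))
      .cache()
    day.count()                        // force materialization into the cache
    days = day :: days
    combined = sc.union(days).cache()  // queries are served off this RDD
  }

  // One fixed query the web service might expose for D3.js plotting.
  def totalsPerKey(): Array[(String, Double)] =
    combined.reduceByKey(_ + _).collect()
}
```

From `combined` you could equally register a DataFrame/temp table and step into Spark SQL for the fixed queries, as suggested above.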