Chris,

I would agree with Alexander about doing more work up front.

We had a lot of data in SQL that we moved to Riak, and our initial instinct, like yours, was to keep the data normalised and use M/R to work out aggregate values.

After some time experimenting, I believe the better solution, which we now use, is to play to Riak's strengths and treat it more like an infinite-size store with fast lookup on keys. This means denormalising your data and maybe storing the same piece of information in several different ways to match your later access patterns.
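
To make that concrete, here is a rough sketch using the Python client. The bucket names and the meter_id/day key scheme are just illustrative, not what we actually run, but the pattern is: one write per access path, so every later read is a single key lookup.

    import riak

    client = riak.RiakClient()

    def store_reading(meter_id, ts, kwh):
        # Write the same reading once per access pattern.
        reading = {'meter': meter_id, 'ts': ts.isoformat(), 'kwh': kwh}
        day = ts.strftime('%Y-%m-%d')

        # Pattern 1: all of a meter's readings for a given day.
        by_day = client.bucket('readings_by_meter_day')
        obj = by_day.get('%s/%s' % (meter_id, day))
        series = obj.data or []          # data is None if the key is new
        series.append(reading)
        obj.data = series
        obj.store()

        # Pattern 2: the latest reading per meter, overwritten in place.
        client.bucket('latest_reading').new(meter_id, data=reading).store()

Disk is cheap compared with the cost of computing these views on demand later.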

This is antithetical to the SQL view of the world, but it allows us to scale much better. In our application of smart meter data, we keep SQL around for all the low volume data that we like to query in lots of different ways, and use Riak for the high volume but slowly changing stuff. As the latter arrives, we store it in several data structures, pre-computing most of the calculations that were previously done on the fly in SQL.
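
As a sketch of the "pre-compute on arrival" step (again with made-up bucket names, and assuming a single writer per key; concurrent writers would need sibling resolution, or counters on newer Riak):

    import riak

    client = riak.RiakClient()

    def ingest(meter_id, ts, kwh):
        # Fold each reading into a running monthly total as it arrives,
        # so reads never need a MapReduce pass.
        totals = client.bucket('monthly_totals')
        obj = totals.get('%s/%s' % (meter_id, ts.strftime('%Y-%m')))
        agg = obj.data or {'kwh': 0.0, 'count': 0}
        agg['kwh'] += kwh
        agg['count'] += 1
        obj.data = agg
        obj.store()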

M/R as implemented in Riak has its applications, but it is a poor choice when you're starting with 'all the data'. You can help it along by preparing your data to be M/R friendly.
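
For instance (a sketch, not our production code): if each day's readings live under a predictable key, you can hand M/R a short, explicit key list instead of letting it walk the whole bucket, which is where it falls over.

    import riak

    client = riak.RiakClient()

    # Count three days of readings for one meter: M/R over an
    # enumerable handful of keys, never over the full bucket.
    mr = riak.RiakMapReduce(client)
    for day in ('2013-04-01', '2013-04-02', '2013-04-03'):
        mr.add('readings_by_meter_day', 'meter42/%s' % day)
    mr.map('function(v) { return [JSON.parse(v.values[0].data).length]; }')
    mr.reduce('Riak.reduceSum')
    print(mr.run())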

Paul



On 15/04/13 05:47, Alexander Sicular wrote:
> ... by date via a secondary index query or via riak search. Oh, and
> precompute everything. Pick whichever time slice has less keys than the
> number of keys that make your queries go boom. If a month is too big do
> a week or even a day. Persist all computation in materialized keys like ...
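
To spell out Alexander's by-date secondary index suggestion, something like the following (the index name 'day_bin' is made up, and 2i needs a backend that supports it, e.g. eLevelDB):

    import riak

    client = riak.RiakClient()
    bucket = client.bucket('readings_by_meter_day')

    # Tag each object with its day at write time...
    obj = bucket.new('meter42/2013-04-15', data=[])
    obj.add_index('day_bin', '2013-04-15')
    obj.store()

    # ...then pull back one bounded slice (a week here) by index range,
    # rather than scanning or mapping over everything.
    keys = bucket.get_index('day_bin', '2013-04-08', '2013-04-15')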

