We use Alain's solution as well to make major operational revisions. We have a "red team" and a "blue team" in each AWS region, so we just add and drop datacenters to get where we want to be.
Pretty simple.

ml

On Tue, Mar 31, 2015 at 8:16 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> People keep asking me if we finally found a solution (even if this is 3+
> years old) so I will just update this thread with our findings.
>
> We finally managed to do this thanks to our big data and reporting stacks,
> by storing blobs corresponding to HLL (HyperLogLog) structures. HLL is an
> algorithm used by Google, Twitter and many more to solve count-distinct
> problems. Structures built through this algorithm can be "summed" and give
> a good approximation of the UV number.
>
> The precision you reach depends on the size of the structure you choose
> (predictable precision). You can reach a fairly acceptable approximation
> with small data structures.
>
> So we basically store one HLL per hour and just "sum" the HLLs for all the
> hours in the requested range (you can do it at day level or any other
> level depending on your needs).
>
> Hope this will help some of you; we finally had this (good) idea after
> more than 3 years. Actually we have used HLL for a long time, but the idea
> of storing HLL structures instead of counts allows us to query custom
> ranges (at the price of more intelligence in the reporting stack, which
> must read and smartly sum HLLs stored as blobs). We have been happy with
> it since.
>
> C*heers,
>
> Alain
>
> 2012-01-19 22:21 GMT+01:00 Milind Parikh <milindpar...@gmail.com>:
>
>> You might want to look at the code in countandra.org, regardless of
>> whether you use it. It uses a model of dynamic composite keys (although
>> static composite keys would have worked as well). For the actual query,
>> only one row is hit. This of course only works because the data model is
>> attuned for the query.
>>
>> Regards
>> Milind
>>
>> /***********************
>> sent from my android...please pardon occasional typos as I respond @ the
>> speed of thought
>> ************************/
>>
>> On Jan 19, 2012 1:31 AM, "Alain RODRIGUEZ" <arodr...@gmail.com> wrote:
>>
>> Hi, thanks for your answer, but I don't want to add another layer on top
>> of Cassandra. I also built my whole application without Countandra and I
>> would like to continue this way.
>>
>> Furthermore there is a Cassandra modeling problem that I would like to
>> solve, and not just hide.
>>
>> Alain
>>
>>
>>
>> 2012/1/18 Lucas de Souza Santos <lucas...@gmail.com>
>> >
>> > Why not http://www.countandra.org/
>> >
>> >
>> > ...
>>
>>
>
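
For readers who want to see what "one HLL per hour, summed over a range" can look like, here is a minimal sketch in Python. It is not Alain's code: the `HyperLogLog` class, the `unique_visitors` helper, the hour-key format, the SHA-1 hash and the register count (p=12, roughly 1.6% standard error) are all assumptions made for illustration. The blobs would live in a Cassandra blob column; a plain dict stands in for that here.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch with 2**p one-byte registers."""

    def __init__(self, p=12, registers=None):
        self.p = p
        self.m = 1 << p
        if registers is not None and len(registers) != self.m:
            raise ValueError("blob size does not match 2**p registers")
        self.registers = bytearray(registers) if registers is not None else bytearray(self.m)

    def add(self, item):
        # 64-bit hash: the top p bits pick a register, the rest give the rank.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = position of the leftmost 1-bit in the remaining 64 - p bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # "Summing" two sketches is a register-wise maximum (a set union).
        if self.p != other.p:
            raise ValueError("cannot merge sketches of different sizes")
        merged = bytes(max(a, b) for a, b in zip(self.registers, other.registers))
        return HyperLogLog(self.p, merged)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, valid for m >= 128
        estimate = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)

    def to_blob(self):
        # Bytes that could be written to a blob column as-is.
        return bytes(self.registers)


def unique_visitors(hourly_blobs, hours, p=12):
    """Merge the stored hourly sketches for `hours` and estimate distinct visitors."""
    total = HyperLogLog(p)
    for hour in hours:
        blob = hourly_blobs.get(hour)
        if blob is not None:
            total = total.merge(HyperLogLog(p, blob))
    return total.count()


# Hypothetical data: two hourly sketches sharing 1000 visitors.
h10 = HyperLogLog()
for i in range(3000):
    h10.add(f"user-{i}")
h11 = HyperLogLog()
for i in range(2000, 5000):
    h11.add(f"user-{i}")

blobs = {"2015-03-31T10": h10.to_blob(), "2015-03-31T11": h11.to_blob()}
estimate = unique_visitors(blobs, ["2015-03-31T10", "2015-03-31T11"])
print(estimate)  # an approximation of the 5000 true distinct visitors
```

Because the merge is a register-wise maximum, visitors seen in both hours are not double-counted, which is what makes arbitrary ranges possible where adding per-hour counts would overcount. A production setup would more likely use an existing HLL library than this hand-rolled version.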