On Sat, Jun 7, 2014 at 10:41 AM, Colin Clark <co...@clark.ws> wrote:

> It's an anti-pattern and there are better ways to do this.
>
>
Entirely possible :)

It would be nice to have a document with a bunch of common Cassandra design
patterns.

I've been trying to track down a pattern for this, and a lot of it is
scattered across individual blog posts, so one has to reverse engineer
it.


> I have implemented the paging algorithm you've described using wide rows
> and bucketing.  This approach is a more efficient utilization of
> Cassandra's built in wholesome goodness.
>

So.. I assume the general pattern is to:

Create a fixed number of buckets, say 2^16. The bucket is your partition key.


Then you place a timestamp next to the bucket in the primary key.

So essentially:

primary key( bucket, timestamp )…

.. so to read from this bucket you essentially execute:

select * from foo where bucket = 100 and timestamp > 12345790 limit 10000;
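
To make the read pattern concrete, here is a minimal in-memory sketch in Python of paging through one bucket by timestamp. It is not Cassandra driver code: BucketStore, read_all, bucket_for and the CRC32 hash are hypothetical names and choices made up for illustration, and the sketch assumes non-negative timestamps that are unique within a bucket.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 2 ** 16  # bucket count suggested in the message above


def bucket_for(item_id: str) -> int:
    """Map an item to a bucket with a stable hash (CRC32 is an assumed choice)."""
    return zlib.crc32(item_id.encode("utf-8")) % NUM_BUCKETS


class BucketStore:
    """In-memory stand-in for a table with primary key (bucket, timestamp)."""

    def __init__(self):
        # bucket -> {timestamp: payload}; one row per (bucket, timestamp)
        self._rows = defaultdict(dict)

    def insert(self, bucket, timestamp, payload):
        self._rows[bucket][timestamp] = payload

    def read_page(self, bucket, after_ts, limit):
        """Mimics: select * from foo where bucket = ? and timestamp > ? limit ?"""
        rows = self._rows.get(bucket, {})
        selected = sorted(ts for ts in rows if ts > after_ts)[:limit]
        return [(ts, rows[ts]) for ts in selected]


def read_all(store, bucket, page_size):
    """Page through a bucket, resuming after the last timestamp seen."""
    last_ts = -1  # assumes timestamps are non-negative
    while True:
        page = store.read_page(bucket, last_ts, page_size)
        if not page:
            break
        yield from page
        last_ts = page[-1][0]
```

Each page restarts from the last timestamp returned, which is the same cursor the "timestamp > …" clause in the query expresses.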


>
> Also, I wouldn't let any (huge) number of clients connect directly to the
> cluster to do this. Put some type of app server in between to handle the
> comms and fan out.  You'll get better utilization of resources and less
> overhead in addition to flexibility of which data center you're utilizing
> to serve requests.
>
>
This is interesting… since the partition is the bucket, it's easy to pick a
poor number of buckets.

For example,

If you use 2^64 buckets, the number of items in each bucket is going to be
rather small.  So you're going to have tons of queries, each fetching 0-1
rows (if you have a small amount of data).

But if you use very FEW buckets.. say 5, and you have a cluster of 1000
nodes, then those 5 buckets will live on at most 5 nodes (plus their
replicas), and the rest of the nodes will hold no data.
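
The two extremes can be checked with a rough back-of-the-envelope sketch in Python. It is a simplification, not how Cassandra actually places data: node_for stands in for the token ring by hashing the bucket id modulo the node count, replication is ignored, and all function names are hypothetical.

```python
import hashlib


def node_for(bucket: int, num_nodes: int) -> int:
    """Assign a bucket to a node by hashing its id (stand-in for a token ring)."""
    digest = hashlib.md5(str(bucket).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def nodes_with_data(num_buckets: int, num_nodes: int) -> int:
    """Count how many distinct nodes own at least one bucket."""
    return len({node_for(b, num_nodes) for b in range(num_buckets)})


def avg_rows_per_bucket(num_rows: int, num_buckets: int) -> float:
    """Average rows per bucket if rows spread evenly over the buckets."""
    return num_rows / num_buckets


if __name__ == "__main__":
    # Too few buckets: most of a 1000-node cluster holds nothing.
    print(nodes_with_data(5, 1000))
    # Too many buckets: almost every query comes back empty.
    print(avg_rows_per_bucket(1_000_000, 2 ** 64))
```

With 5 buckets the first number can never exceed 5, whatever the cluster size, while with 2^64 buckets the average bucket holds far less than one row.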

Hm..

The byte ordered partitioner solves this problem because I can just pick a
fixed number of buckets and use the bucket as the primary key prefix. The
data in a bucket can then be split up across machines at any arbitrary
point, even in the middle of a 'bucket' …


-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.
