Re: Best design in Cassandra

2010-02-02 Thread Erik Holstad
On Mon, Feb 1, 2010 at 3:31 PM, Brandon Williams dri...@gmail.com wrote:

 On Mon, Feb 1, 2010 at 5:20 PM, Erik Holstad erikhols...@gmail.com wrote:

 Hey!
 Have a couple of questions about the best way to use Cassandra.
 Using the random partitioner + the multi_get calls vs order preservation +
 range_slice calls?


 When you use an OPP, the distribution of your keys becomes your problem.
  If you don't have an even distribution, this will be reflected in the load
 on the nodes, while the RP gives you even distribution.


Yeah, that is why it would be nice to hear whether anyone has compared the
performance of the two, to see if it is worth worrying about your own
distribution. I also read that the random partitioner doesn't give that
great a distribution.



 What is the benefit of using multiple families vs super column?


 http://issues.apache.org/jira/browse/CASSANDRA-598 is currently why I
 prefer simple CFs instead of supercolumns.

Yeah, this is nasty.



 For example, in the case of sorting in different orders. One good thing
 that I can see when using super columns is that you don't have to restart
 your cluster every time you want to add a new sort order.


 A supercolumn can still only compare subcolumns in a single way.

Yeah, I know that, but you can have a super column per sort order without
having to restart the cluster.


 When http://issues.apache.org/jira/browse/CASSANDRA-44 is completed, you
 will be able to add CFs without restarting.

Looks interesting, but it is targeted at 0.7, so it is probably going to be
a little while, right?


 -Brandon




-- 
Regards Erik


Re: Best design in Cassandra

2010-02-02 Thread Brandon Williams
On Tue, Feb 2, 2010 at 9:27 AM, Erik Holstad erikhols...@gmail.com wrote:

 A supercolumn can still only compare subcolumns in a single way.

 Yeah, I know that, but you can have a super column per sort order without
 having to restart the cluster.


You get a CompareWith for the columns, and a CompareSubcolumnsWith for
subcolumns.  If you need more column types to get different sort orders, you
need another ColumnFamily.
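
To make the point above concrete, here is a minimal sketch in Python. It only
models the idea of one comparator for column names and one for subcolumn
names per ColumnFamily; `SuperColumnFamily`, `insert` and `get` are
hypothetical names for illustration, not Cassandra's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SuperColumnFamily:
    """Toy model: one ordering per level, fixed for the whole family."""
    compare_with: Callable[[str], object]             # orders supercolumn names
    compare_subcolumns_with: Callable[[str], object]  # orders subcolumn names
    rows: Dict[str, Dict[str, Dict[str, str]]] = field(default_factory=dict)

    def insert(self, key: str, supercolumn: str, subcolumn: str, value: str) -> None:
        self.rows.setdefault(key, {}).setdefault(supercolumn, {})[subcolumn] = value

    def get(self, key: str) -> List[Tuple[str, List[str]]]:
        """Return supercolumns and their subcolumn names in comparator order."""
        row = self.rows.get(key, {})
        return [
            (sc, sorted(row[sc], key=self.compare_subcolumns_with))
            for sc in sorted(row, key=self.compare_with)
        ]


def fill(cf: SuperColumnFamily) -> None:
    cf.insert("row1", "a", "10", "x")
    cf.insert("row1", "a", "2", "y")
    cf.insert("row1", "b", "9", "z")
    cf.insert("row1", "b", "30", "w")


# Subcolumns compared as integers: every supercolumn sorts numerically.
numeric_cf = SuperColumnFamily(compare_with=str, compare_subcolumns_with=int)
# Subcolumns compared as text: every supercolumn sorts lexically.
text_cf = SuperColumnFamily(compare_with=str, compare_subcolumns_with=str)
fill(numeric_cf)
fill(text_cf)

print(numeric_cf.get("row1"))
print(text_cf.get("row1"))
```

In this model, supercolumns "a" and "b" cannot sort their subcolumns
differently inside one family; getting both orderings takes two families.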

-Brandon


Re: Best design in Cassandra

2010-02-02 Thread Erik Holstad
On Tue, Feb 2, 2010 at 7:45 AM, Brandon Williams dri...@gmail.com wrote:

 On Tue, Feb 2, 2010 at 9:27 AM, Erik Holstad erikhols...@gmail.com wrote:

 A supercolumn can still only compare subcolumns in a single way.

 Yeah, I know that, but you can have a super column per sort order without
 having to restart the cluster.


 You get a CompareWith for the columns, and a CompareSubcolumnsWith for
 subcolumns.  If you need more column types to get different sort orders, you
 need another ColumnFamily.

I'm not sure what you mean by column types. What I want is to have a few
things sorted in both ascending and descending order, like {a,b}, {b,a} and
{1,2}, {2,1}.
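
For the plain ascending/descending case described here, one possible
approach is to store the columns once, in a single order, and read them back
in reverse when the other order is wanted. The sketch below is a Python
model only; `SortedRow` and its methods are hypothetical names. As far as I
know, the Thrift `SliceRange` struct has a `reversed` field that plays the
role of the `reverse` argument used here.

```python
from bisect import insort
from typing import List


class SortedRow:
    """Toy model of a row whose column names are kept in one sort order."""

    def __init__(self) -> None:
        self._names: List = []

    def insert(self, name) -> None:
        # Keep names unique and sorted ascending on insert.
        if name not in self._names:
            insort(self._names, name)

    def slice(self, count: int, reverse: bool = False) -> List:
        """Return up to `count` names, ascending or descending."""
        names = self._names[::-1] if reverse else self._names
        return names[:count]


letters = SortedRow()
for name in ["b", "a"]:
    letters.insert(name)

numbers = SortedRow()
for name in [2, 1]:
    numbers.insert(name)

print(letters.slice(2), letters.slice(2, reverse=True))
print(numbers.slice(2), numbers.slice(2, reverse=True))
```

This only covers a pair of orders that are exact mirrors of each other; two
unrelated orderings (say, by name and by timestamp) would still need
separate comparators.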


 -Brandon




-- 
Regards Erik


Best design in Cassandra

2010-02-01 Thread Erik Holstad
Hey!
Have a couple of questions about the best way to use Cassandra.
Using the random partitioner + the multi_get calls vs order preservation +
range_slice calls?

What is the benefit of using multiple families vs super column? For example,
in the case of sorting in different orders. One good thing that I can see
when using super columns is that you don't have to restart your cluster
every time you want to add a new sort order.


-- 
Regards Erik


Re: Best design in Cassandra

2010-02-01 Thread Brandon Williams
On Mon, Feb 1, 2010 at 5:20 PM, Erik Holstad erikhols...@gmail.com wrote:

 Hey!
 Have a couple of questions about the best way to use Cassandra.
 Using the random partitioner + the multi_get calls vs order preservation +
 range_slice calls?


When you use an OPP, the distribution of your keys becomes your problem.  If
you don't have an even distribution, this will be reflected in the load on
the nodes, while the RP gives you even distribution.
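
A rough illustration of that difference, as a minimal Python sketch. It
assumes a 4-node ring, uses the MD5 hash of the key as a stand-in for the
RP's token, and uses made-up range boundaries for the OPP; the function
names and boundaries are hypothetical, and real token assignment is more
involved.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NODES = 4
MD5_SPACE = 2 ** 128  # size of the MD5 output space


def random_partitioner_node(key: str) -> int:
    """Pick a node from the key's MD5 hash (stand-in for an RP token)."""
    token = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return token * NODES // MD5_SPACE


# Hypothetical OPP ring: node 0 owns keys below "g", node 1 below "n",
# node 2 below "t", node 3 everything else.
OPP_BOUNDARIES = ["g", "n", "t"]


def order_preserving_node(key: str) -> int:
    """Pick a node from the key's position in sort order."""
    return bisect_right(OPP_BOUNDARIES, key)


# Keys that share a common prefix, as application keys often do.
keys = [f"user{i:05d}" for i in range(10000)]
rp_load = Counter(random_partitioner_node(k) for k in keys)
opp_load = Counter(order_preserving_node(k) for k in keys)

print("RP :", sorted(rp_load.items()))
print("OPP:", sorted(opp_load.items()))
```

With these keys, every key starts with "user", so the OPP model places all
of them on the last node, while the hashed placement spreads them across all
four nodes. The flip side is that under the OPP model adjacent keys sit
together, which is what makes range scans possible.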

 What is the benefit of using multiple families vs super column?


http://issues.apache.org/jira/browse/CASSANDRA-598 is currently why I
prefer simple CFs instead of supercolumns.


 For example, in the case of sorting in different orders. One good thing
 that I can see when using super columns is that you don't have to restart
 your cluster every time you want to add a new sort order.


A supercolumn can still only compare subcolumns in a single way.

When http://issues.apache.org/jira/browse/CASSANDRA-44 is completed, you
will be able to add CFs without restarting.

-Brandon