I wrote some thoughts about this on my blog. I think it's still mostly correct:
* http://www.ayogo.com/techblog/2010/04/sorting-in-cassandra/

On Fri, Oct 15, 2010 at 11:14 AM, Wicked J <wickedj2...@gmail.com> wrote:
> Hi,
> I'm using TimeUUID/Sort by column name mechanism. The column value can
> contain text data (in future they may contain image data as well) leading to
> the possibility of a row out-growing the RAM capacity. Given this background
> my questions are:
>
> a] How many columns are recommended against one row? Based on my app. needs,
> I can imagine having 10 million would be a good starting point for the
> max_limit (based on text data). Also note that my app. will use search in
> ranges of 100 or 200 columns when there are large number of records(columnar
> data) without a caching solution in the front.
> b] What partitioner is recommended? so that the load in the cluster nodes is
> not largely uneven.
> c] Would you recommend changing the TimeUUID/Columnar sort mechanism (with a
> change in the data model) to sort using row key mechanism? If so then what
> partitioner is recommended? with load not being largely uneven.
>
> Thanks