I guess you can use the same system. You need two CFs for that, and I think it's better to use Cassandra 0.8 because it supports counters:
One CF with UTF8Type called active-topics, and one CF with UUIDType called
topics-seen. Then, using the same principle, for each visit to Topic1, Topic2,
Topic1 during timestampN, you create a TimeUUID and you insert:

active-topics[topics:timestampN] = {Topic1:whateveryouwant}
topics-seen[topic:Topic1] = {TimeUUID1:whatever}

active-topics[topics:timestampN] = {Topic2:whateveryouwant}
topics-seen[topic:Topic2] = {TimeUUID2:whatever}

active-topics[topics:timestampN] = {Topic1:whateveryouwant}
topics-seen[topic:Topic1] = {TimeUUID3:whatever}

Then, when you want to query, you first query all the topics (slice) in
active-topics for topics:timestampN, and then you get the column counts in the
topics-seen CF for all the topics found in active-topics.

Not so simple... By the way, it adds overhead compared to a simple counter
solution, but I think it is far more elegant. This is just my opinion.

Victor

2011/5/18 Aditya Narayan <ady...@gmail.com>

> Thanks victor!
>
> Aren't there any good ways by using Cassandra alone ?
>
>
> On Wed, May 18, 2011 at 11:41 PM, openvictor Open <openvic...@gmail.com> wrote:
>
>> Have you thought about using another kind of database, one which supports
>> volatile content for example ?
>>
>> I am currently thinking about doing something similar. The best and
>> simplest option at the moment that I can think of is Redis. In redis you
>> have the option of querying keys with wildcards. Your problem can be done by
>> just inserting an UUID into Redis for a certain amount of time ( the best is
>> to tailor this amount of time as an inverse function of the number of keys
>> existing in Redis).
>>
>> *With Redis*
>> What I would do : I cut down time in pieces of X minutes ( 15 minutes, for
>> example by truncating a timestamp).
>> Let timestampN be the timestamp for the
>> period of time ( [N,N+15] ), let Topic1 Topic2 be two topics then :
>>
>> One or more people will view Topic1 then Topic2 then again Topic1 in this
>> period of 15 minutes
>> (HINCRBY <http://redis.io/commands/hincrby> is the Increment)
>> HINCRBY topics:Topic1:timestampN viewcount 1
>> HINCRBY topics:Topic2:timestampN viewcount 1
>> HINCRBY topics:Topic1:timestampN viewcount 1
>>
>> Then you just query in the following way :
>>
>> MGET <http://redis.io/commands/mget> topics:*:timestampN
>>
>> * is the wildcard, you order by viewcount and you have what you are asking
>> for !
>> This is a simplified version of what you should do but personally I
>> really like the combination of Cassandra and Redis.
>>
>>
>> Victor
>>
>> 2011/5/18 Aditya Narayan <ady...@gmail.com>
>>
>>> I would arrange for memtable flush period in such a manner that the time
>>> period for which these most viewed discussions are generated equals the
>>> memtable flush timeperiod, so that the entire row of most viewed discussion
>>> on a topic is in one or maximum two memtables/ SST tables.
>>> This would also help minimize several versions of the same column in the
>>> row parts in different SST tables.
>>>
>>>
>>>
>>> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan <ady...@gmail.com> wrote:
>>>
>>>> *************
>>>> For a discussions forum, I need to show a page of most viewed
>>>> discussions.
>>>>
>>>> For implementing this, I maintain a count of views of a discussion &
>>>> when this views count of a discussion passes a certain threshold limit, the
>>>> discussion Id is added to a row of most viewed discussions.
>>>>
>>>> This row of most viewed discussions contains columns with Integer names
>>>> & values containing serialized lists of Ids of all discussions whose views
>>>> count equals the Integral name of this column.
>>>>
>>>> Thus if the view count of a discussion increases I'll need to move its
>>>> 'Id' from serialized list in some column to serialized list in another
>>>> column whose name represents the updated views count on that discussion.
>>>>
>>>> Thus I can get the most viewed discussions by getting the appropriate no
>>>> of columns from one end of this Integer sorted row.
>>>>
>>>> ************
>>>>
>>>> I wanted to get feedback from you all, to know if this is a good design.
>>>>
>>>> Thanks
>>>>
>>>
>>
>
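
For reference, here is a minimal sketch of the two-CF scheme from the top of this message. It is only an in-memory illustration: plain Python dicts stand in for the active-topics and topics-seen column families, and no Cassandra client is involved. The helper names (bucket_start, record_view, most_viewed) and the 15-minute period are assumptions made for the example, not part of any real API.

```python
import uuid
from collections import defaultdict

# Assumed period length (15 minutes, as in the quoted Redis example).
BUCKET_SECONDS = 15 * 60

# Dicts standing in for the two column families: row key -> {column name: value}.
active_topics = defaultdict(dict)  # "topics:<timestampN>" -> {topic: unused value}
topics_seen = defaultdict(dict)    # "topic:<topic>"       -> {TimeUUID: unused value}


def bucket_start(ts):
    """Truncate a Unix timestamp to the start of its period (timestampN)."""
    return int(ts) - int(ts) % BUCKET_SECONDS


def record_view(topic, ts):
    """Record one visit: mark the topic as active and add one TimeUUID column."""
    active_topics["topics:%d" % bucket_start(ts)][topic] = ""
    # uuid1() is time-based, playing the role of the TimeUUID column name.
    topics_seen["topic:%s" % topic][uuid.uuid1()] = ""


def most_viewed(ts, limit=10):
    """Slice the active topics for the period, then count columns per topic."""
    row = active_topics.get("topics:%d" % bucket_start(ts), {})
    counts = [(topic, len(topics_seen["topic:%s" % topic])) for topic in row]
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:limit]


now = 1305741600  # arbitrary timestamp that falls on a period boundary
for visited in ["Topic1", "Topic2", "Topic1"]:
    record_view(visited, now)

print(most_viewed(now))  # [('Topic1', 2), ('Topic2', 1)]
```

As in the scheme above, the count for a topic is simply the number of columns in its topics-seen row, so it is not restricted to the current period. The extra read per active topic is the overhead mentioned earlier compared to a plain counter.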