Have you thought about using another kind of database, one that supports
volatile content, for example?

I am currently thinking about doing something similar. The best and simplest
option I can think of at the moment is Redis. In Redis you can look up keys
with wildcards (glob patterns). Your problem can be solved by just inserting a
UUID into Redis with an expiry time (ideally, tailor this time as an inverse
function of the number of keys existing in Redis).
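
As a rough illustration of the expiry idea, here is a minimal Python sketch that shrinks the time-to-live as the key count grows. The function name `ttl_for_key_count`, the constants, and the exact formula are my own assumptions for illustration, not something Redis prescribes.

```python
def ttl_for_key_count(key_count, max_ttl=3600, min_ttl=60, scale=1000):
    """Return a TTL in seconds that decreases as the number of keys grows.

    Hypothetical formula: max_ttl / (1 + key_count / scale),
    never going below min_ttl.
    """
    if key_count < 0:
        raise ValueError("key_count must be non-negative")
    ttl = max_ttl / (1 + key_count / scale)
    return max(min_ttl, int(ttl))
```

The key count could come from DBSIZE, and the result could be used as the expiry when writing the key (for example with SETEX).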

*With Redis*
What I would do: cut time into pieces of X minutes (15 minutes, for
example) by truncating a timestamp. Let timestampN be the timestamp for the
period of time [N, N+15], and let Topic1 and Topic2 be two topics. Then:
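
To make the truncation concrete, here is a small Python sketch. The helper names `bucket_start` and `topic_key` are hypothetical, and I am assuming Unix timestamps in seconds.

```python
def bucket_start(timestamp, minutes=15):
    """Truncate a Unix timestamp (in seconds) to the start of its bucket."""
    width = minutes * 60
    return int(timestamp) - int(timestamp) % width


def topic_key(topic, timestamp, minutes=15):
    """Build a key of the form topics:<topic>:<bucket start>."""
    return "topics:%s:%d" % (topic, bucket_start(timestamp, minutes))
```

Every view that falls in the same 15-minute window then maps to the same key, so the counters for that window accumulate in one place.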

One or more people will view Topic1, then Topic2, then Topic1 again in this
period of 15 minutes.
(HINCRBY is the increment command: <http://redis.io/commands/hincrby>)

HINCRBY topics:Topic1:timestampN viewcount 1
HINCRBY topics:Topic2:timestampN viewcount 1
HINCRBY topics:Topic1:timestampN viewcount 1

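Then you just query in the following way:

KEYS topics:*:timestampN
HGET topics:Topic1:timestampN viewcount   (repeat for each key returned)

(MGET <http://redis.io/commands/mget> does not expand wildcards and only reads
string values, so list the keys with KEYS first and read each hash with HGET.)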

* is the wildcard; you order the results by viewcount and you have what you
are asking for!
This is a simplified version of what you should do, but personally I really
like the combination of Cassandra and Redis.
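
Put together, the counting and the ranking could look roughly like the Python sketch below. It assumes a client object exposing `hincrby`, `scan_iter` and `hget` the way the redis-py client does; the function names `record_view` and `most_viewed` are hypothetical, and topic names are assumed not to contain a colon. Since only KEYS and SCAN expand wildcards, the sketch iterates over the matching keys and reads each counter individually.

```python
def record_view(client, topic, bucket):
    """Increment the view counter for a topic in the given time bucket."""
    return client.hincrby("topics:%s:%s" % (topic, bucket), "viewcount", 1)


def most_viewed(client, bucket, limit=10):
    """Return (topic, views) pairs for one bucket, most viewed first."""
    counts = []
    for key in client.scan_iter(match="topics:*:%s" % bucket):
        if isinstance(key, bytes):
            key = key.decode("utf-8")
        # Key layout is topics:<topic>:<bucket>; assumes no ':' in the topic.
        topic = key.split(":")[1]
        views = int(client.hget(key, "viewcount") or 0)
        counts.append((topic, views))
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:limit]
```

With redis-py installed, `client` would be something like `redis.Redis()`. Sorting happens on the application side here, which should be fine for a handful of topics per time window.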


Victor

2011/5/18 Aditya Narayan <ady...@gmail.com>

> I would arrange the memtable flush period so that the time period over
> which these most viewed discussions are generated equals the memtable flush
> period, so that the entire row of most viewed discussions on a topic is in
> one or at most two memtables/SSTables.
> This would also help minimize having several versions of the same column in
> parts of the row spread across different SSTables.
>
>
>
> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan <ady...@gmail.com> wrote:
>
>> *************
>> For a discussions forum, I need to show a page of most viewed discussions.
>>
>> For implementing this, I maintain a count of views of a discussion & when
>> this views count of a discussion passes a certain threshold limit, the
>> discussion Id is added to a row of most viewed discussions.
>>
>> This row of most viewed discussions contains columns with Integer names &
>> values containing serialized lists of Ids of all discussions whose views
>> count equals the Integral name of this column.
>>
>> Thus if the view count of a discussion increases I'll need to move its
>> 'Id' from serialized list in some column to serialized list in another
>> column whose name represents the updated views count on that discussion.
>>
>> Thus I can get the most viewed discussions by getting the appropriate number
>> of columns from one end of this integer-sorted row.
>>
>> ************
>>
>> I wanted to get feedback from you all, to know if this is a good design.
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>
