Hi Stack,

Thanks for your reply!

Two major queries in my system are "select" queries:

1. Select main comments by page, n items per page, ordered by last update DESC. The last update of a main comment is the timestamp of its newest sub-comment (if it doesn't have any sub-comment, last update is the timestamp of main comment).

2. For each main comment, select 2 newest sub-comments, and select all sub-comments.

With my current schema, the second query can be done easily, but the first cannot.

I'm thinking about the second index table with this schema:

table "lastpost_index"
    row key: <reversed_timestamp_of_last_post>_<id_of_main_comment>
    no need any family.

Each time a main comment has new sub comment, i insert a new row into "lastpost_index" table. Scanning "lastpost_index" table and parsing row keys i could have a list of <id_of_main_comment> order by lastpost DESC. However, there is one problem: Some main comments are "busy" than others, it means that they have more sub-comment and are updated more regularly than others. For example:

table "lastpost_index":
key: 123456789_10
key: 123456799_11
key: 123456899_11
key: 123456999_9

the main comment with id 11 has two newer sub-comments than main comment 9. When i scan the table, the main comment 11 appear twice.

So, it's difficult for me to get exactly n newest distinct main comment using lastpost_index.

Regards,

Eddie Bui

On 4/20/2011 11:30 PM, Stack wrote:
It looks like you keep up a second index table ordered by last post
(row key will have a reversed timestamp of the post's date).

What are the queries you are going to make against the table?

St.Ack

On Tue, Apr 19, 2011 at 11:53 PM, Bui Ngoc Son<gemmountain...@gmail.com>  wrote:
Hi everybody,

I am desiging a two-tiers comment system like facebook: the system inlcuded
main comments and each main comment has a various number of sub-comments. My
schema is as follow:

table comments
    family "data":
        "data:content" - content of main comment
        "data:uid" - uid of user who posted this comment
        "data:sub_count" - the number of its sub-comment.
    family "sub_content"
        "sub_content:<cid>": content of the sub-comment which has id=cid
    family "sub_uid"
        "sub_uid:<cid>": uid of user who posted the sub-comment which has
id=cid

So, each sub-comment has two columns in 2 families with an unique cid. As
well, each main comment has an unique id that is decreased by time (to make
sure that newer main comments are selected first)

However, my boss requires that main comments have to be ordered by last
update first. It is mean that, each time a main comment have a new
sub-comment, it becomes "newer" than the others, and i have to sort the main
comments by the time of their last sub-comments.

How can i design the comment schema to meet my boss's requirement? can i
keep the current schema and use some extra-indexing table? or i have to
re-design the schema?

Thanks in advance

Regards,

Eddie Bui


Reply via email to