Hi Stack,
Thanks for your reply!
Two major queries in my system are "select" queries:
1. Select main comments by page, n items per page, ordered by last
update DESC. The last update of a main comment is the timestamp of its
newest sub-comment (if it doesn't have any sub-comment, last update is
the timestamp of main comment).
2. For each main comment, select 2 newest sub-comments, and select all
sub-comments.
With my current schema, the second query can be done easily, but the
first cannot.
I'm thinking about the second index table with this schema:
table "lastpost_index"
row key: <reversed_timestamp_of_last_post>_<id_of_main_comment>
no need any family.
Each time a main comment has new sub comment, i insert a new row into
"lastpost_index" table.
Scanning "lastpost_index" table and parsing row keys i could have a list
of <id_of_main_comment> order by lastpost DESC. However, there is one
problem:
Some main comments are "busy" than others, it means that they have more
sub-comment and are updated more regularly than others. For example:
table "lastpost_index":
key: 123456789_10
key: 123456799_11
key: 123456899_11
key: 123456999_9
the main comment with id 11 has two newer sub-comments than main comment
9. When i scan the table, the main comment 11 appear twice.
So, it's difficult for me to get exactly n newest distinct main comment
using lastpost_index.
Regards,
Eddie Bui
On 4/20/2011 11:30 PM, Stack wrote:
It looks like you keep up a second index table ordered by last post
(row key will have a reversed timestamp of the post's date).
What are the queries you are going to make against the table?
St.Ack
On Tue, Apr 19, 2011 at 11:53 PM, Bui Ngoc Son<gemmountain...@gmail.com> wrote:
Hi everybody,
I am desiging a two-tiers comment system like facebook: the system inlcuded
main comments and each main comment has a various number of sub-comments. My
schema is as follow:
table comments
family "data":
"data:content" - content of main comment
"data:uid" - uid of user who posted this comment
"data:sub_count" - the number of its sub-comment.
family "sub_content"
"sub_content:<cid>": content of the sub-comment which has id=cid
family "sub_uid"
"sub_uid:<cid>": uid of user who posted the sub-comment which has
id=cid
So, each sub-comment has two columns in 2 families with an unique cid. As
well, each main comment has an unique id that is decreased by time (to make
sure that newer main comments are selected first)
However, my boss requires that main comments have to be ordered by last
update first. It is mean that, each time a main comment have a new
sub-comment, it becomes "newer" than the others, and i have to sort the main
comments by the time of their last sub-comments.
How can i design the comment schema to meet my boss's requirement? can i
keep the current schema and use some extra-indexing table? or i have to
re-design the schema?
Thanks in advance
Regards,
Eddie Bui