Hi, I'm currently trying to implement an offline message retrieval solution wherein I retrieve messages after a particular timestamp for specific users. My question is will what route should I go for…multple primary keys on an IN clause or using 2i
The current model of the messages table looks something like this CREATE TABLE msg_archive( thread_id varchar, ts timestamp, msg blob, PRIMARY KEY (thread_id, ts)) where thread_id is an alphabetized order of sender and recipient such as "brian|john" Now, in order to retrieve the messages, I will have to retrieve them based on the number of contacts you have and as such the query will look something like this SELECT * FROM msg_archive WHERE thread_id IN ('brian|john', 'brian|james'….) AND ts < 1234567890; Ofcourse the list of friends a user can have can potentially reach around 500 or even worse 1000 so the IN clause can potentially have these large amount of primary keys. The question is will this work well or do I have to modify the schema such that we should incorporate secondary indexes And look something like this instead? CREATE TABLE msg_archive( thread_id varchar, recipient varchar, ts timestamp, msg blob, PRIMARY KEY (thread_id, ts)) CREATE INDEX ON msg_archive (recipient); For the select statement, ofcourse it will be as simple as SELECT * FROM msg_archive WHERE recipient = 'brian' AND ts < 1234567890; Which is actually better in terms of performance? Or are there other suggestions to this kind of model? Thanks! Byron -- Byron Wang Sent with Sparrow (http://www.sparrowmailapp.com/?sig)