Hi folks - I'm doing an informal proof-of-concept with Cassandra and I've been 
getting some conflicting information about how I should lay out my data.  
Perhaps somebody could point me in the right direction.

I have a column family that will have billions of rows of data.  The data have 
no intrinsic unique identifier.  A given row will have, say, 50 columns, and 
I'll need to be able to efficiently query on 8-10 of them.

I've been told that I should just pick the most common search item and make 
that my primary key, even though it will not be unique.  That seems contrary to 
the documentation I am seeing online.

From my reading, it seems like I need a UUID column that will be my primary 
index, and then I should set up secondary indexes on the 8-10 primary search 
columns.  Am I understanding this correctly?  Any advice you can offer on this 
would be tremendously helpful.  I'm quite limited in how specific I can be 
about the data, of course.
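
To make the second layout concrete, here is a rough sketch in plain Python.  
It is not Cassandra code and says nothing about how Cassandra stores or 
distributes data; it only models the shape I am describing: each row gets a 
generated UUID as its key, and each searchable column gets its own lookup 
from value to row keys.  The column names (col_a, col_b) and the Table class 
are placeholders made up for illustration.

```python
import uuid
from collections import defaultdict

# Hypothetical column names; the real ones are not given in this post.
INDEXED_COLUMNS = ("col_a", "col_b")


class Table:
    """Toy in-memory model: rows keyed by UUID, one lookup map per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = {}  # UUID -> row dict
        # For each indexed column: value -> set of row UUIDs holding that value.
        self.indexes = {col: defaultdict(set) for col in indexed_columns}

    def insert(self, row):
        # The data have no natural unique identifier, so generate one per row.
        key = uuid.uuid4()
        self.rows[key] = dict(row)
        for col, index in self.indexes.items():
            if col in row:
                index[row[col]].add(key)
        return key

    def find(self, column, value):
        # Look up matching row keys through the column's index, then fetch rows.
        keys = self.indexes[column].get(value, set())
        return [self.rows[k] for k in keys]
```

With this model, two rows that share the same col_a value are both kept 
(they have different UUIDs), and find("col_a", value) returns both.  Using 
col_a itself as the key would instead need some extra component to keep the 
rows distinct.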

Steve
