best practices?

Stephen.M.Thompson Tue, 11 Dec 2012 14:46:06 -0800

Dean, thank you for your response.  To the second half of the query, I'm a 
little concerned about the secondary index approach, since the columns that I 
want to index have high entropy.




For example, I would like to query by User name and IP address, values which 
are decidedly NOT like the pattern recommended in the Secondary Index field.   
The 8-10 columns I need to search by all have a similarly high scatter rate.  
Since the documentation seems to suggest that this is a bad idea, what would 
the correct pattern look like?



In an RDBMS I would just slap an alternate key index on the table and let it 
roll.   It seems like maybe that is not the right approach for Cassandra?



Thanks again,

Steve



-----Original Message-----
From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
Sent: Tuesday, December 11, 2012 4:57 PM
To: user@cassandra.apache.org
Subject: Re: Primary/secondary index question / best practices?



Hard to help out on a design without specifics, but here is some advice based 
on the limited information.



Primary key: yes, it must be cluster unique.  Use a TimeUUID or UUID.  PlayOrm 
has unique TimeUUID-like keys such as 7AL2S8Y.b1 (b1 is the hostname and the 
prefix is a "unique" timestamp encoded as a shorter string, which gives nice 
readable primary keys).
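
As a rough illustration of the two key styles mentioned above, here is a minimal Python sketch. `uuid.uuid1()` is the standard library's time-based (version 1) UUID. The `short_time_key` helper is purely hypothetical: it assumes a base-36 millisecond timestamp followed by a dot and a hostname, which only imitates the shape of the 7AL2S8Y.b1 example and is not PlayOrm's actual algorithm.

```python
import time
import uuid
from typing import Optional

_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(n: int) -> str:
    """Encode a non-negative integer as an upper-case base-36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(_ALPHABET[r])
    return "".join(reversed(digits))


def short_time_key(host: str, millis: Optional[int] = None) -> str:
    """Hypothetical readable key: <base-36 timestamp>.<host>.

    Assumption: at most one key per host per millisecond. A real
    generator would also need a counter or similar tie-breaker.
    """
    if millis is None:
        millis = int(time.time() * 1000)
    return f"{to_base36(millis)}.{host}"


# Standard time-based UUID (version 1), usable as a TimeUUID-style key.
time_uuid = uuid.uuid1()
```

The short form trades the collision guarantees of a full UUID for readability, so it only makes sense if hostnames are unique across the cluster.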



There are some patterns you can look into here that may help: 
https://github.com/deanhiller/playorm/wiki/Patterns-Page



If you can partition your data virtually, it may help a lot so you can query 
into the partitions.
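
One way to picture querying into partitions is a hand-maintained index that is scoped to a partition, so each lookup touches a single small index row instead of scanning the whole cluster. The sketch below is an in-memory model only: plain dictionaries stand in for column families, and the class name, the month-based partition ids, and the column names are all assumptions made for illustration, not PlayOrm or Cassandra APIs.

```python
from collections import defaultdict


class PartitionedIndex:
    """In-memory model of a manual index scoped by virtual partition."""

    def __init__(self):
        # primary key -> (partition, row); stands in for the data table
        self.rows = {}
        # (partition, column, value) -> set of primary keys;
        # stands in for one wide index row per partition/column/value
        self.index = defaultdict(set)

    def put(self, partition, key, row):
        """Write a row and keep its index entries in sync."""
        old = self.rows.get(key)
        if old is not None:
            # Remove stale index entries left by the previous version.
            old_partition, old_row = old
            for col, val in old_row.items():
                self.index[(old_partition, col, val)].discard(key)
        self.rows[key] = (partition, dict(row))
        for col, val in row.items():
            self.index[(partition, col, val)].add(key)

    def find(self, partition, column, value):
        """Return the sorted primary keys matching column == value
        within one partition."""
        return sorted(self.index.get((partition, column, value), set()))


idx = PartitionedIndex()
idx.put("2012-12", "k1", {"user": "alice", "ip": "10.0.0.1"})
idx.put("2012-12", "k2", {"user": "bob", "ip": "10.0.0.1"})
idx.put("2012-11", "k3", {"user": "alice", "ip": "10.0.0.2"})
matches = idx.find("2012-12", "ip", "10.0.0.1")
```

High-entropy columns such as user name or IP address suit this layout because each index row stays small. The cost is that the application must update the index on every write, as `put` does here.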



Later,

Dean



From: stephen.m.thomp...@wellsfargo.com
Reply-To: user@cassandra.apache.org
Date: Tuesday, December 11, 2012 2:49 PM
To: user@cassandra.apache.org
Subject: Primary/secondary index question / best practices?



From my reading, it seems like I need a UUID column that will be my primary index, 
and then I should set up secondary indexes on the 8-10 primary search columns.  
Am I understanding this correctly?  Any advice you can offer on this would be 
tremendously helpful.  I'm quite limited in how specific I can be about the 
data, of course.
