Oh, and one last thing... there is no limit on the number of partitions, just on partition size really.
Dean

On 12/11/12 4:26 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

>Is there any column that would be a good qualifier as a partition key?
>
>Some people partition by time, like every month or every day, and then you
>can either have your own secondary indexes that you query into (high
>entropy is NOT a big deal here), or PlayOrm can do some for you, or you
>could use CQL as well.
>
>Other partitioning schemes are to partition by client.
>
>The goal is to have probably fewer than about 5 million rows in a
>partition so your wide row index is not too large.
>
>Dean
>
>From: "stephen.m.thomp...@wellsfargo.com" <stephen.m.thomp...@wellsfargo.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Tuesday, December 11, 2012 3:45 PM
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: RE: Primary/secondary index question / best practices?
>
>Dean, thank you for your response. To the second half of the query, I'm
>a little concerned about the secondary index approach since the indexes
>that I want to create are columns with high entropy.
>
>For example, I would like to query by User name and IP address, values
>which are decidedly NOT like the pattern recommended in the Secondary
>Index field. The 8-10 columns I need to search by all have a similarly
>high scatter rate. Since the documentation seems to suggest that this
>is a bad idea, what would the correct pattern look like?
>
>In an RDBMS I would just slap an alternate key index on the table and let
>it roll. It seems like maybe that is not the right approach for
>Cassandra?
>
>Thanks again,
>
>Steve
>
>-----Original Message-----
>From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
>Sent: Tuesday, December 11, 2012 4:57 PM
>To: user@cassandra.apache.org
>Subject: Re: Primary/secondary index question / best practices?
>
>Hard to help out on a design without specifics, but here is some advice
>based on the limited information.
>
>Primary key: yes, must be cluster unique. TimeUUID or UUID... PlayOrm has
>unique TimeUUID-like keys, as in this one: 7AL2S8Y.b1 (b1 is the
>hostname and the prefix is a "unique" timestamp, but generated as a
>shorter string; ah, nice readable primary keys).
>
>There are some patterns you can look into here that may help:
>https://github.com/deanhiller/playorm/wiki/Patterns-Page
>
>If you can partition your data virtually, it may help a lot so you can
>query into the partitions.
>
>Later,
>
>Dean
>
>From: "stephen.m.thomp...@wellsfargo.com" <stephen.m.thomp...@wellsfargo.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Tuesday, December 11, 2012 2:49 PM
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: Primary/secondary index question / best practices?
>
>From my reading, it seems like I need a UUID column that will be my primary
>index, and then I should set up secondary indexes on the 8-10 primary
>search columns. Am I understanding this correctly? Any advice you can
>offer on this would be tremendously helpful. I'm quite limited in how
>specific I can be about the data, of course.
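
To make the "partition by time, then query into your own index" idea from the replies above more concrete, here is a minimal in-memory Python sketch. It is not PlayOrm or Cassandra code: the two dictionaries are hypothetical stand-ins for a data column family and a wide-row index column family, and the names rows, index, partition_for, insert and find are made up for illustration. Monthly partitions and the user/ip columns are assumptions taken from the examples in the thread.

```python
from collections import defaultdict
import uuid

# Hypothetical stand-in for the data column family: row key -> row.
rows = {}

# Hypothetical stand-in for the index column family. Each
# (partition, column) pair plays the role of one wide index row that
# maps an indexed value to the row keys holding that value.
index = defaultdict(lambda: defaultdict(set))


def partition_for(ts):
    """Assumed scheme: one partition per calendar month, e.g. '2012-12'."""
    return ts.strftime("%Y-%m")


def insert(ts, user, ip):
    """Store a row under a UUID key and index it inside its partition."""
    key = str(uuid.uuid4())  # cluster-unique primary key
    row = {"ts": ts, "user": user, "ip": ip}
    rows[key] = row
    part = partition_for(ts)
    for column in ("user", "ip"):
        index[(part, column)][row[column]].add(key)
    return key


def find(part, column, value):
    """Query into one partition's index by column value."""
    keys = index.get((part, column), {}).get(value, set())
    return [rows[k] for k in keys]
```

For example, after insert(datetime(2012, 12, 11), "steve", "10.0.0.1"), calling find("2012-12", "user", "steve") returns that row. Because each index row only covers one partition, a high-entropy column such as an IP address is fine here; what has to stay bounded is the number of rows per partition, not the number of distinct values.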