I would do #2. Take a look at this blog which talks about secondary indexes, cardinality, and what it means for cassandra. Secondary indexes in cassandra are a different beast, so often old rules of thumb about indexes don't apply. http://www.wentnet.com/blog/?p=77
On Tue, Jan 28, 2014 at 10:41 AM, Edward Capriolo <edlinuxg...@gmail.com>wrote: > Generally indexes on binary fields true/false male/female are not terrible > effective. > > > On Tue, Jan 28, 2014 at 12:40 PM, Jimmy Lin <y2klyf+w...@gmail.com> wrote: > >> I have a simple column family like the following >> >> create table people( >> company_id text, >> employee_id text, >> gender text, >> primary key(company_id, employee_id) >> ); >> >> if I want to find out all the "male" employee given a company id, I can do >> >> 1/ >> select * from people where company_id=xxxx' >> and loop through the result efficiently to pick the employee who has >> gender column value equal to "male" >> >> 2/ >> add a seconday index >> create index gender_index on people(gender) >> select * from people where company_id='xxx' and gender='male' >> >> >> I though #2 seems more appropriate, but I also thought the secondary >> index is helping only locating the primary row key, with the select clause >> in #2, is it more efficient than #1 where application responsible loop >> through the result and filter the right content? >> >> ( >> It totally make sense if I only need to find out all the male >> employee(and not within a company) by using >> select * from people where gender='male" >> ) >> >> thanks >> > >