Jimmy,

The secondary index is getting scanned because you put the indexed column
in your query. The behavior you are looking for is an upcoming feature called
Global Indexes, slated for 3.0. https://issues.apache.org/jira/browse/CASSANDRA-6477

In the meantime, you could build your own lookup table, even with
cardinality this low. If the point is to find everyone of a certain gender
in a company, give this a try.

create table company_gender (
   company_id uuid,
   gender text,
   person_id uuid,
   -- person_id is a clustering column so each person gets their own row
   PRIMARY KEY (company_id, gender, person_id)
);

Each company would be a partition, and you could find all males or females
with a single query. The bonus is that you would get paging, which will be
much more efficient.
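
As a rough sketch of how the lookup could be used (this assumes person_id
is part of the clustering key so each person is stored as a separate row,
that 'male' is just a placeholder for however you encode gender, and that
the ? marks are bind markers in prepared statements):

-- write to the lookup table whenever you write to company
insert into company_gender (company_id, gender, person_id)
   values (?, 'male', ?);

-- read everyone of one gender in one company from a single partition
select person_id from company_gender
   where company_id = ? and gender = 'male';

The trade-off is that your application has to keep both tables in sync
itself.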

Patrick




On Fri, Mar 6, 2015 at 2:56 PM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:

> Hi,
> I ran into an RPC timeout exception when executing a query that involves
> a secondary index on a Boolean column when, for example, the company has
> more than 1k persons.
>
> select * from company where company_id=xxxx and isMale = true;
>
> As the other docs state, a secondary index with such extremely low
> cardinality will result in basically 2 large rows, one per value. However,
> I thought that since I also bounded the query with my primary partition key,
> that would be consulted first to narrow down the result, making the query
> efficient?
>
> Also, if I simply do
> select * from company where company_id=xxxx ;
> (without the AND clause on the secondary index), it returns right away.
>
>
> Or maybe the Cassandra server internally always processes the secondary
> index result first?
>
> thanks
>
>
>
> I have a simple table
>
> create table company (
> company_id uuid,
> person_id uuid,
> isMale Boolean,
> PRIMARY KEY (company_id, person_id)
> );
>
>
>
>
>
